[jira] [Commented] (SOLR-16654) Add support for node-level caches
[ https://issues.apache.org/jira/browse/SOLR-16654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850547#comment-17850547 ] David Smiley commented on SOLR-16654: - [~magibney] the test has been failing about 0.5% of the time since this was committed: [http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.search.TestThinCache.testSimple] (we can't re-open this because it shipped, but we can just post a JIRA-less PR) > Add support for node-level caches > - > > Key: SOLR-16654 > URL: https://issues.apache.org/jira/browse/SOLR-16654 > Project: Solr > Issue Type: New Feature >Affects Versions: main (10.0) >Reporter: Michael Gibney >Priority: Minor > Fix For: 9.4 > > Time Spent: 4h 20m > Remaining Estimate: 0h > > Caches are currently configured only at the level of individual cores, sized > according to expected usage patterns for the core. > The main tradeoff in cache sizing is heap space, which is of course limited > at the JVM/node level. Thus there is a conflict between sizing caches to > per-core use patterns vs. sizing caches to enforce limits on overall heap > usage. > This issue proposes some minor changes to facilitate the introduction of > node-level caches: > # support a {{}} node in {{solr.xml}}, to parse named cache configs, > for caches to be instantiated/accessible at the level of {{CoreContainer}}. > The syntax of this config node would be identical to the syntax of the "user > caches" config in {{solrconfig.xml}}. > # provide a hook in searcher warming to initialize core-level caches with the > initial associated searcher. (analogous to {{warm()}}, but for the initial > searcher -- see SOLR-16017, which fwiw was initially opened to support a > different use case that requires identical functionality). > Part of the appeal of this approach is that the above (minimal) changes are > the only changes required to enable pluggable node-level cache > implementations -- i.e. 
no further API changes are necessary, and no > behavioral changes are introduced for existing code. > Note: I anticipate that the functionality enabled by node-level caches will > mainly be useful for enforcing global resource limits -- it is not primarily > expected to be used for sharing entries across different cores/searchers > (although such use would be possible). > Initial use cases envisioned: > # "thin" core-level caches (filterCache, queryResultCache, etc.) backed by > "node-level" caches. > # dynamic (i.e. not static-"firstSearcher") warming of OrdinalMaps, by > placing OrdinalMaps in an actual cache with, e.g., a time-based expiration > policy. > This functionality would be particularly useful for cases with many cores per > node, and even more so in cases with uneven core usage patterns. But having > the ability to configure resource limits at a level that directly corresponds > to the available resources (i.e., node-level) would be generally useful for > all cases. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
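To make item #1 above concrete: a hypothetical {{solr.xml}} fragment for a node-level cache definition. The element name in the quoted description was lost in extraction ({{}}), so the <caches> tag and all attribute names below are assumptions that simply mirror the "user caches" syntax of {{solrconfig.xml}} -- a sketch, not the committed syntax.

```xml
<!-- Hypothetical solr.xml fragment; the <caches> element and the attribute
     names are assumed, mirroring the "user caches" syntax of solrconfig.xml. -->
<solr>
  <caches>
    <cache name="myNodeLevelCache"
           class="solr.CaffeineCache"
           size="4096"
           initialSize="1024"
           autowarmCount="0"/>
  </caches>
</solr>
```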
[jira] [Updated] (SOLR-16093) Tests should not require a working IPv6 networking stack
[ https://issues.apache.org/jira/browse/SOLR-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-16093: Summary: Tests should not require a working IPv6 networking stack (was: HttpClient does not gracefully handle IPv6) > Tests should not require a working IPv6 networking stack > > > Key: SOLR-16093 > URL: https://issues.apache.org/jira/browse/SOLR-16093 > Project: Solr > Issue Type: Test > Components: Tests >Reporter: Mike Drob >Assignee: David Smiley >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > I was running tests inside of a docker container (trying to parallelize some > stuff in a different way) and likely had my networking set up incorrectly. > This was with JDK17. > I'm not sure how the IPv6 shard addresses got in there; maybe that's what Solr > decided to register in zookeeper, or maybe it was an artifact of my docker > container doing some weird translation. > {{shards=http://127.0.0.1:41629/x_bm/lr/collection1|[::1]:4/x_bm/lr|[::1]:6/x_bm/lr,http://127.0.0.1:44693/x_bm/lr/collection1,[::1]:4/x_bm/lr|http://127.0.0.1:44741/x_bm/lr/collection1}} > {noformat} > 2> 88712 INFO (qtp1293439783-64) [ x:collection1] o.a.s.c.S.Request > webapp=/x_bm/lr path=/select > params={q=id:42=http://127.0.0.1:41629/x_bm/lr/collection1|[::1]:4/x_bm/lr|[::1]:6/x_bm/lr,http://127.0.0.1:44693/x_bm/lr/collection1,[::1]:4/x_bm/lr|http://127.0.0.1:44741/x_bm/lr/collection1=0=javabin=2} > status=500 QTime=252 > 2> 88716 ERROR (qtp1293439783-64) [ x:collection1] o.a.s.s.HttpSolrCall > org.apache.solr.common.SolrException: > org.apache.solr.client.solrj.SolrServerException: Unsupported address type > 2> => org.apache.solr.common.SolrException: > org.apache.solr.client.solrj.SolrServerException: Unsupported address type > 2>at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:504) > 2> org.apache.solr.common.SolrException: > org.apache.solr.client.solrj.SolrServerException: Unsupported 
address type > 2>at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:504) > ~[main/:?] > 2>at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:230) > ~[main/:?] > 2>at org.apache.solr.core.SolrCore.execute(SolrCore.java:2866) > ~[main/:?] > 2>at > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:881) [main/:?] > 2>at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:600) > [main/:?] > 2>at > org.apache.solr.servlet.SolrDispatchFilter.dispatch(SolrDispatchFilter.java:234) > [main/:?] > 2>at > org.apache.solr.servlet.SolrDispatchFilter.lambda$doFilter$0(SolrDispatchFilter.java:202) > [main/:?] > 2>at > org.apache.solr.servlet.ServletUtils.traceHttpRequestExecution2(ServletUtils.java:257) > [main/:?] > 2>at > org.apache.solr.servlet.ServletUtils.rateLimitRequest(ServletUtils.java:227) > [main/:?] > 2>at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197) > [main/:?] > 2>at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179) > [main/:?] > 2>at > org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) > [jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601) > [jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:187) > [main/:?] 
> 2>at > org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193) > [jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601) > [jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548) > [jetty-servlet-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) > [jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624) > [jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) > [jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1434) > [jetty-server-9.4.44.v20210927.jar:9.4.44.v20210927] > 2>at >
[jira] [Updated] (SOLR-16093) HttpClient does not gracefully handle IPv6
[ https://issues.apache.org/jira/browse/SOLR-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-16093: Component/s: Tests Priority: Minor (was: Major)
[jira] [Commented] (SOLR-16093) HttpClient does not gracefully handle IPv6
[ https://issues.apache.org/jira/browse/SOLR-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850206#comment-17850206 ] David Smiley commented on SOLR-16093: - I spent some time with this today and was finally able to reproduce the issue in a Docker container locally. IPv6 isn't supported on this OS/JVM/settings combination. Java doesn't mandate that IPv6 be supported, I presume, yet certain tests exercise non-delivery/dead hosts using IPv6 addresses in SolrTestCaseJ4.DEAD_HOST_1 (and 2 and 3). I think we should simply use 127.0.0.1 here. I'll post a PR tonight.
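To make the DEAD_HOST discussion above concrete, here is a minimal stdlib-only Java sketch (the class and method names are hypothetical, not Solr test code) showing that a "::1"-style literal resolves to an Inet6Address while 127.0.0.1 stays IPv4, and how a test could probe whether the host has any IPv6 interface at all:

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.UnknownHostException;
import java.util.Collections;

public class Ipv6Probe {

    // True when the string is an IPv6 literal such as "::1".
    // Resolving a literal does no DNS lookup, so this is deterministic.
    static boolean isV6Literal(String addr) throws UnknownHostException {
        return InetAddress.getByName(addr) instanceof Inet6Address;
    }

    // True if any local interface carries an IPv6 address; the Docker
    // environment described above would plausibly return false here.
    static boolean hostSupportsIpv6() throws Exception {
        for (NetworkInterface nif : Collections.list(NetworkInterface.getNetworkInterfaces())) {
            for (InetAddress a : Collections.list(nif.getInetAddresses())) {
                if (a instanceof Inet6Address) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isV6Literal("::1"));        // true: IPv6 loopback literal
        System.out.println(InetAddress.getByName("127.0.0.1") instanceof Inet4Address); // true
        System.out.println("IPv6 capable: " + hostSupportsIpv6());
    }
}
```

A dead-host test that only needs a non-responding address works equally well with an IPv4 loopback address on an unused port, which is the change proposed in the comment.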
[jira] [Assigned] (SOLR-16093) HttpClient does not gracefully handle IPv6
[ https://issues.apache.org/jira/browse/SOLR-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reassigned SOLR-16093: --- Assignee: David Smiley
[jira] [Commented] (ZOOKEEPER-4835) Netty should be an optional dependency
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17850052#comment-17850052 ] David Smiley commented on ZOOKEEPER-4835: - Note that TLS can be achieved completely independently via a service mesh such as Istio. {quote} that the zookeeper client really works without any Netty {quote} It's not clear how to make a "client" vs. "server" distinction within the codebase to enforce that it's okay for the "server" to talk to Netty but not the "client"; plenty of packages are shared between them (utils). Anyway, as the ArchUnit test shows, it's really pretty close to allowing a class-name-based distinction of which classes can talk to Netty. > Netty should be an optional dependency > -- > > Key: ZOOKEEPER-4835 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4835 > Project: ZooKeeper > Issue Type: Improvement >Reporter: David Smiley >Priority: Major > Attachments: zk-netty-violations.txt > > > ZK should not mandate the inclusion of Netty if Netty features aren't being > used. There are very few usages of Netty from ZK files that are not named > Netty, so this looks pretty easy.
[jira] [Updated] (ZOOKEEPER-4835) Netty should be an optional dependency
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated ZOOKEEPER-4835: Attachment: zk-netty-violations.txt
[jira] [Commented] (ZOOKEEPER-4835) Netty should be an optional dependency
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849897#comment-17849897 ] David Smiley commented on ZOOKEEPER-4835: - I attempted to conclusively confirm how viable excluding Netty might be. I came upon ArchUnit[1] and wrote a little test[2] to look for ZK classes that don't have Netty in the name yet call a class that does, or that call Netty directly. The results show the X509 utilities are intertwined with Netty. Furthermore, the methods ZooKeeper.getClientCnxnSocket, ZooKeeperServer.getOutstandingHandshakeNum and UnifiedServerSocket$UnifiedSocket.detectMode() refer to Netty. Thus it seems unsafe to exclude Netty today, but if the ZK project wanted to invest in better separation, it seems close at hand. [1] ArchUnit: https://www.archunit.org/ [2] my test: https://gist.github.com/dsmiley/8a34cf16dd5827e5396e6da24e19afd2
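The class-name-based rule described above can be sketched without ArchUnit. The following stdlib-only Java is an illustration of the idea, not the linked gist: given the set of classes found to depend on io.netty, flag any whose simple name does not contain "Netty". The example class names are for illustration only.

```java
import java.util.List;
import java.util.stream.Collectors;

public class NettyRuleSketch {

    // The name-based rule the comment suggests: only classes with "Netty"
    // in their simple class name may depend on io.netty.
    static boolean mayTouchNetty(String className) {
        String simple = className.substring(className.lastIndexOf('.') + 1);
        return simple.contains("Netty");
    }

    // Given the classes that were found to depend on io.netty, report violators.
    static List<String> violations(List<String> classesDependingOnNetty) {
        return classesDependingOnNetty.stream()
                .filter(c -> !mayTouchNetty(c))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> deps = List.of(
                "org.apache.zookeeper.server.NettyServerCnxnFactory", // allowed by the rule
                "org.apache.zookeeper.common.X509Util");              // a violation
        System.out.println(violations(deps)); // [org.apache.zookeeper.common.X509Util]
    }
}
```

An ArchUnit rule would express the same predicate against imported bytecode rather than a hand-fed list of class names.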
[jira] [Created] (ZOOKEEPER-4835) Netty should be an optional dependency
David Smiley created ZOOKEEPER-4835: --- Summary: Netty should be an optional dependency Key: ZOOKEEPER-4835 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4835 Project: ZooKeeper Issue Type: Improvement Reporter: David Smiley ZK should not mandate the inclusion of Netty if Netty features aren't being used. There are very few usages of Netty from ZK files that are not named Netty, so this looks pretty easy.
[jira] [Commented] (SOLR-16093) HttpClient does not gracefully handle IPv6
[ https://issues.apache.org/jira/browse/SOLR-16093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849494#comment-17849494 ] David Smiley commented on SOLR-16093: - I see the same problem! JDK 17, Solr 9.4. Docker with RHEL 9.3 if that matters. I wish I could reproduce this locally.
[jira] [Commented] (SOLR-13681) make Lucene's index sorting directly configurable in Solr
[ https://issues.apache.org/jira/browse/SOLR-13681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17849358#comment-17849358 ] David Smiley commented on SOLR-13681: - I noticed this ticket is "minor" priority, but if it elevates index sorting to something we can all know/understand and use easily, it's a big win. > make Lucene's index sorting directly configurable in Solr > - > > Key: SOLR-13681 > URL: https://issues.apache.org/jira/browse/SOLR-13681 > Project: Solr > Issue Type: New Feature >Reporter: Christine Poerschke >Priority: Minor > Attachments: SOLR-13681-refguide-skel.patch, SOLR-13681.patch > > Time Spent: 3h > Remaining Estimate: 0h > > History/Background: > * SOLR-5730 made Lucene's SortingMergePolicy and > EarlyTerminatingSortingCollector configurable in Solr 6.0 or later. > * LUCENE-6766 made index sorting a first-class citizen in Lucene 6.2 or later. > Current status: > * In Solr 8.2, use of index sorting is only available via configuration of a > (top-level) merge policy that is a SortingMergePolicy; that policy's sort > is then passed to the index writer config via the > {code} > if (mergePolicy instanceof SortingMergePolicy) { > Sort indexSort = ((SortingMergePolicy) mergePolicy).getSort(); > iwc.setIndexSort(indexSort); > } > {code} > https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.2.0/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L241-L244 > code path. > Proposed change: > * in-scope for this ticket: to add direct support for index sorting > configuration in Solr. > * out-of-scope for this ticket: deprecation and removal of SortingMergePolicy > support
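For context on the "current status" bullet above, here is a sketch of a solrconfig.xml fragment that reaches the quoted SolrIndexConfig code path by producing a SortingMergePolicy. The wiring follows the SortingMergePolicyFactory convention; the sort field ({{timestamp}}) and the wrapped TieredMergePolicyFactory are illustrative assumptions, not taken from this thread.

```xml
<!-- Sketch: an <indexConfig> whose merge policy is a SortingMergePolicy, so
     the quoted "if (mergePolicy instanceof SortingMergePolicy)" branch sets
     the index sort. Field name and wrapped policy are illustrative. -->
<indexConfig>
  <mergePolicyFactory class="org.apache.solr.index.SortingMergePolicyFactory">
    <str name="sort">timestamp desc</str>
    <str name="wrapped.prefix">inner</str>
    <str name="inner.class">org.apache.solr.index.TieredMergePolicyFactory</str>
  </mergePolicyFactory>
</indexConfig>
```

The ticket's proposal would let users state the index sort directly instead of smuggling it in through the merge policy.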
[jira] [Commented] (SOLR-17303) CVE-2023-39410: Upgrade to apache-avro version 1.11.3
[ https://issues.apache.org/jira/browse/SOLR-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849289#comment-17849289 ] David Smiley commented on SOLR-17303: - The chain that you show looks wrong – snappy-java only has a test dependency on hadoop-common – [see the pom|https://repo1.maven.org/maven2/org/xerial/snappy/snappy-java/1.1.10.1/snappy-java-1.1.10.1.pom]. I looked in our Docker image, which has the same stuff as our normal download distro. Found nothing; you can try this yourself: {{docker run --rm solr:9.6 find /opt/ -name '*avro*jar'}} > CVE-2023-39410: Upgrade to apache-avro version 1.11.3 > - > > Key: SOLR-17303 > URL: https://issues.apache.org/jira/browse/SOLR-17303 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: security >Affects Versions: 9.6 >Reporter: Sujeet Hinge >Priority: Major > > CVE-2023-39410: Upgrade Apache-Avro version to 1.11.3 > When deserializing untrusted or corrupted data, it is possible for a reader > to consume memory beyond the allowed constraints and thus lead to out of > memory on the system. This issue affects Java applications using Apache Avro > Java SDK up to and including 1.11.2. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-15438) Refactor: Simplify SolrDispatchFilter close/destroy
[ https://issues.apache.org/jira/browse/SOLR-15438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-15438: Resolution: Duplicate Status: Resolved (was: Patch Available) Closing as the ideas here are mostly addressed in a fix for SOLR-17118 and somewhat also by SOLR-15590 > Refactor: Simplify SolrDispatchFilter close/destroy > --- > > Key: SOLR-15438 > URL: https://issues.apache.org/jira/browse/SOLR-15438 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Assignee: David Smiley >Priority: Minor > Time Spent: 1.5h > Remaining Estimate: 0h > > SolrDispatchFilter's close process is more convoluted than it needs to be. > There is conditionality via a boolean closeOnDestroy that JettySolrRunner > uses, yet it seems it doesn't really need this logic. JSR could instead call > Jetty FilterHolder's stop() method, which tracks lifecycle to know if it hasn't > been called, and it can skip needless null checks. Also SDF's reference to > CoreContainer needn't be null'ed out, which makes some logic simpler above > that needn't guard against null. The HttpClient needn't be null'ed either. > We don't need a reference to SolrMetricManager; it can be gotten from > CoreContainer easily. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-10654: Description: Expose metrics via a `wt=prometheus` response type. Example scrape_config in prometheus.yml: {code:java} scrape_configs: - job_name: 'solr' metrics_path: '/solr/admin/metrics' params: wt: ["prometheus"] static_configs: - targets: ['localhost:8983'] {code} Rationale for having this despite the "Prometheus Exporter". They have different strengths and weaknesses. was: Expose metrics via a `wt=prometheus` response type. Example scrape_config in prometheus.yml: {code} scrape_configs: - job_name: 'solr' metrics_path: '/solr/admin/metrics' params: wt: ["prometheus"] static_configs: - targets: ['localhost:8983'] {code} > Expose Metrics in Prometheus format DIRECTLY from Solr > -- > > Key: SOLR-10654 > URL: https://issues.apache.org/jira/browse/SOLR-10654 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Keith Laban >Priority: Major > Attachments: prometheus_metrics.txt > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Expose metrics via a `wt=prometheus` response type. > Example scrape_config in prometheus.yml: > {code:java} > scrape_configs: > - job_name: 'solr' > metrics_path: '/solr/admin/metrics' > params: > wt: ["prometheus"] > static_configs: > - targets: ['localhost:8983'] > {code} > Rationale for having this despite the "Prometheus Exporter". They have > different strengths and weaknesses. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-10654: Description: Expose metrics via a `wt=prometheus` response type. Example scrape_config in prometheus.yml: {code:java} scrape_configs: - job_name: 'solr' metrics_path: '/solr/admin/metrics' params: wt: ["prometheus"] static_configs: - targets: ['localhost:8983'] {code} [Rationale|https://issues.apache.org/jira/browse/SOLR-11795?focusedCommentId=17261423&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17261423] for having this despite the "Prometheus Exporter". They have different strengths and weaknesses. was: Expose metrics via a `wt=prometheus` response type. Example scrape_config in prometheus.yml: {code:java} scrape_configs: - job_name: 'solr' metrics_path: '/solr/admin/metrics' params: wt: ["prometheus"] static_configs: - targets: ['localhost:8983'] {code} Rationale for having this despite the "Prometheus Exporter". They have different strengths and weaknesses. > Expose Metrics in Prometheus format DIRECTLY from Solr > -- > > Key: SOLR-10654 > URL: https://issues.apache.org/jira/browse/SOLR-10654 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Keith Laban >Priority: Major > Attachments: prometheus_metrics.txt > > Time Spent: 5h 40m > Remaining Estimate: 0h > > Expose metrics via a `wt=prometheus` response type. > Example scrape_config in prometheus.yml: > {code:java} > scrape_configs: > - job_name: 'solr' > metrics_path: '/solr/admin/metrics' > params: > wt: ["prometheus"] > static_configs: > - targets: ['localhost:8983'] > {code} > [Rationale|https://issues.apache.org/jira/browse/SOLR-11795?focusedCommentId=17261423&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17261423] > for having this despite the "Prometheus Exporter". They have different > strengths and weaknesses. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
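For illustration, the Prometheus text exposition format that a {{wt=prometheus}} writer would emit consists of `# TYPE` comment lines followed by `name{labels} value` sample lines. The metric and label names below are invented for this sketch and are not Solr's actual metric names:

```java
// Minimal sketch of Prometheus text exposition output. The metric name
// "solr_metrics_core_requests" and the "core" label are hypothetical
// examples, not names actually emitted by Solr.
public class PrometheusLineDemo {
    // Render one sample line: metric name, a single label, and the value.
    static String sample(String name, String coreLabel, double value) {
        return name + "{core=\"" + coreLabel + "\"} " + value;
    }

    public static void main(String[] args) {
        System.out.println("# TYPE solr_metrics_core_requests counter");
        System.out.println(sample("solr_metrics_core_requests", "techproducts", 42.0));
    }
}
```

A Prometheus server pointed at such an endpoint via the scrape_config above would ingest these lines directly, with no separate exporter process in between.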
[jira] [Commented] (SOLR-16505) Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2
[ https://issues.apache.org/jira/browse/SOLR-16505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17849109#comment-17849109 ] David Smiley commented on SOLR-16505: - Can this be resolved again? If so, please do it, Sanjay. > Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2 > -- > > Key: SOLR-16505 > URL: https://issues.apache.org/jira/browse/SOLR-16505 > Project: Solr > Issue Type: Sub-task >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Fix For: 9.7 > > Time Spent: 9h 10m > Remaining Estimate: 0h > > This method and its callers (only RecoveryStrategy) should be converted to a > Jetty HTTP2 client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17118) Solr deadlock during servlet container start
[ https://issues.apache.org/jira/browse/SOLR-17118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848941#comment-17848941 ] David Smiley commented on SOLR-17118: - The problem [strikes again|https://issues.apache.org/jira/browse/SOLR-17300?focusedCommentId=17848866=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17848866]. As long as this PR of mine seems pretty stable, I'm inclined to merge it to main and then monitor builds closely for spooky stuff. It's better than the current alternative. > Solr deadlock during servlet container start > > > Key: SOLR-17118 > URL: https://issues.apache.org/jira/browse/SOLR-17118 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 9.2.1 >Reporter: Andreas Hubold >Assignee: David Smiley >Priority: Major > Labels: deadlock, servlet-context > Time Spent: 1h 40m > Remaining Estimate: 0h > > In rare cases, Solr can run into a deadlock when started. The servlet > container startup thread gets blocked and there's no other thread that could > unblock it: > {noformat} > "main" #1 prio=5 os_prio=0 cpu=5922.39ms elapsed=7490.27s > tid=0x7f637402ae70 nid=0x47 waiting on condition [0x7f6379488000] >java.lang.Thread.State: WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base@17.0.9/Native Method) > - parking to wait for <0x81da8000> (a > java.util.concurrent.CountDownLatch$Sync) > at java.util.concurrent.locks.LockSupport.park(java.base@17.0.9/Unknown > Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.9/Unknown > Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@17.0.9/Unknown > Source) > at java.util.concurrent.CountDownLatch.await(java.base@17.0.9/Unknown > Source) > at > org.apache.solr.servlet.CoreContainerProvider$ContextInitializationKey.waitForReadyService(CoreContainerProvider.java:523) > at > 
org.apache.solr.servlet.CoreContainerProvider$ServiceHolder.getService(CoreContainerProvider.java:562) > at > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:148) > at > org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:133) > at > org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$2(ServletHandler.java:725) > at > org.eclipse.jetty.servlet.ServletHandler$$Lambda$315/0x7f62fc2674b8.accept(Unknown > Source) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(java.base@17.0.9/Unknown > Source) > at > java.util.stream.Streams$ConcatSpliterator.forEachRemaining(java.base@17.0.9/Unknown > Source) > at > java.util.stream.ReferencePipeline$Head.forEach(java.base@17.0.9/Unknown > Source) > at > org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:749) > at > org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:392) > > {noformat} > ContextInitializationKey.waitForReadyService should have been unblocked by > CoreContainerProvider#init, which is calling ServiceHolder#setService. This > should work because CoreContainerProvider#init is always called before > SolrDispatchFilter#init (ServletContextListeners are initialized before > Filters). > But there's a problem: CoreContainerProvider#init stores the > ContextInitializationKey and the mapped ServiceHolder in > CoreContainerProvider#services, and that's a *WeakHashMap*: > {code:java} > services > .computeIfAbsent(new ContextInitializationKey(servletContext), > ServiceHolder::new) > .setService(this); > {code} > The key is not referenced anywhere else, which makes the mapping a candidate > for garbage collection. The ServiceHolder value also does not reference the > key anymore, because #setService cleared the reference. > With bad luck, the mapping is already gone from the WeakHashMap before > SolrDispatchFilter#init tries to retrieve it with > CoreContainerProvider#serviceForContext. 
And that method will then create a > new ContextInitializationKey and ServiceHolder, which is then used for > #waitForReadyService. But such a new ContextInitializationKey has never > received a #makeReady call, and #waitForReadyService will block forever. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
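The failure mode described above reduces to a latch mismatch: the latch that gets counted down (on the original key) is not the latch being awaited (on the freshly re-created key). A minimal, self-contained illustration in generic Java, not Solr's actual classes:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrates why re-creating the key blocks forever: countDown() on one
// latch does not release await() on a different latch instance. Generic
// sketch of the mechanism, not Solr code.
public class LatchMismatchDemo {
    public static void main(String[] args) throws InterruptedException {
        CountDownLatch original = new CountDownLatch(1);  // latch of the key that got "makeReady"
        CountDownLatch recreated = new CountDownLatch(1); // latch of the freshly re-created key

        original.countDown(); // analogous to CoreContainerProvider#init signaling readiness

        // The waiter ends up on the recreated latch, which nothing will ever
        // count down. A bounded await shows it would block indefinitely.
        boolean released = recreated.await(200, TimeUnit.MILLISECONDS);
        System.out.println(released); // false: the wait never completes
    }
}
```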
[jira] [Commented] (SOLR-17300) Copy existing listeners on re-creation of Http2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848939#comment-17848939 ] David Smiley commented on SOLR-17300: - "Interrupted while obtaining reference to CoreService" – I am working on fixing this old very intermittent bug SOLR-17118 > Copy existing listeners on re-creation of Http2SolrClient > - > > Key: SOLR-17300 > URL: https://issues.apache.org/jira/browse/SOLR-17300 > Project: Solr > Issue Type: Sub-task >Reporter: Sanjay Dutt >Assignee: Sanjay Dutt >Priority: Major > Fix For: 9.7 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > For custom settings, such as timeouts, usually a Http2SolrClient is created > using the existing HTTP client using below code. > {code:java} > Http2SolrClient.Builder(leaderBaseUrl) > .withHttpClient(existingHttp2SolrClient) > .withIdleTimeout(soTimeout, TimeUnit.MILLISECONDS) > .withConnectionTimeout(connTimeout, TimeUnit.MILLISECONDS) > .build(); > {code} > If not specified, withHttpClient method would automatically copy over some of > the older configuration automatically to the new Http2SolrClient > {code:java} > if (this.basicAuthAuthorizationStr == null) { > this.basicAuthAuthorizationStr = > http2SolrClient.basicAuthAuthorizationStr; > } > if (this.followRedirects == null) { > this.followRedirects = http2SolrClient.httpClient.isFollowRedirects(); > } > if (this.idleTimeoutMillis == null) { > this.idleTimeoutMillis = http2SolrClient.idleTimeoutMillis; > } > if (this.requestWriter == null) { > this.requestWriter = http2SolrClient.requestWriter; > } > if (this.requestTimeoutMillis == null) { > this.requestTimeoutMillis = http2SolrClient.requestTimeoutMillis; > } > if (this.responseParser == null) { > this.responseParser = http2SolrClient.parser; > } > if (this.urlParamNames == null) { > this.urlParamNames = http2SolrClient.urlParamNames; > } > {code} > Nonetheless there is one field that did not pass over yet -- List of > HttpListenerFactory. 
This list also includes the interceptor for Auth, due to > which re-created clients were missing auth credentials and requests were > failing. > *Proposed Solution*: > Along with the other properties, the list of HttpListenerFactory should also be copied > over from the old to the new client by the withHttpClient method. > {code:java} > if (this.listenerFactory == null) { > this.listenerFactory = new ArrayList<>(); > http2SolrClient.listenerFactory.forEach(this.listenerFactory::add); > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
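The copy-over pattern described above can be sketched with a simplified, self-contained model. {{MiniClient}} and {{MiniBuilder}} are hypothetical stand-ins for Http2SolrClient and its Builder, not the real Solr classes:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Http2SolrClient: holds a timeout and the listener factories
// (which, in the real client, include the auth interceptor).
class MiniClient {
    final Long idleTimeoutMillis;
    final List<String> listenerFactories;
    MiniClient(Long idleTimeoutMillis, List<String> listenerFactories) {
        this.idleTimeoutMillis = idleTimeoutMillis;
        this.listenerFactories = listenerFactories;
    }
}

// Stand-in for the Builder: withHttpClient copies settings from the existing
// client only where the caller has not set them explicitly.
class MiniBuilder {
    private Long idleTimeoutMillis;
    private List<String> listenerFactories;

    MiniBuilder withHttpClient(MiniClient existing) {
        if (this.idleTimeoutMillis == null) {
            this.idleTimeoutMillis = existing.idleTimeoutMillis;
        }
        // The proposed fix: also carry over the listener factories, so the
        // re-created client keeps its auth credentials.
        if (this.listenerFactories == null) {
            this.listenerFactories = new ArrayList<>(existing.listenerFactories);
        }
        return this;
    }

    MiniBuilder withIdleTimeout(long millis) {
        this.idleTimeoutMillis = millis;
        return this;
    }

    MiniClient build() {
        return new MiniClient(idleTimeoutMillis,
                listenerFactories == null ? new ArrayList<>() : listenerFactories);
    }
}

public class CopyOverDemo {
    public static void main(String[] args) {
        MiniClient existing = new MiniClient(5000L, List.of("authInterceptor"));
        MiniClient recreated = new MiniBuilder()
                .withHttpClient(existing)
                .withIdleTimeout(30000)
                .build();
        System.out.println(recreated.idleTimeoutMillis); // 30000: explicit setting wins
        System.out.println(recreated.listenerFactories); // [authInterceptor]: copied over
    }
}
```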
[jira] [Assigned] (SOLR-17118) Solr deadlock during servlet container start
[ https://issues.apache.org/jira/browse/SOLR-17118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reassigned SOLR-17118: --- Assignee: David Smiley > Solr deadlock during servlet container start > > > Key: SOLR-17118 > URL: https://issues.apache.org/jira/browse/SOLR-17118 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 9.2.1 >Reporter: Andreas Hubold >Assignee: David Smiley >Priority: Major > Labels: deadlock, servlet-context > Time Spent: 1h 40m > Remaining Estimate: 0h > > In rare cases, Solr can run into a deadlock when started. The servlet > container startup thread gets blocked and there's no other thread that could > unblock it: > {noformat} > "main" #1 prio=5 os_prio=0 cpu=5922.39ms elapsed=7490.27s > tid=0x7f637402ae70 nid=0x47 waiting on condition [0x7f6379488000] >java.lang.Thread.State: WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base@17.0.9/Native Method) > - parking to wait for <0x81da8000> (a > java.util.concurrent.CountDownLatch$Sync) > at java.util.concurrent.locks.LockSupport.park(java.base@17.0.9/Unknown > Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.9/Unknown > Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@17.0.9/Unknown > Source) > at java.util.concurrent.CountDownLatch.await(java.base@17.0.9/Unknown > Source) > at > org.apache.solr.servlet.CoreContainerProvider$ContextInitializationKey.waitForReadyService(CoreContainerProvider.java:523) > at > org.apache.solr.servlet.CoreContainerProvider$ServiceHolder.getService(CoreContainerProvider.java:562) > at > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:148) > at > org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:133) > at > org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$2(ServletHandler.java:725) > at > 
org.eclipse.jetty.servlet.ServletHandler$$Lambda$315/0x7f62fc2674b8.accept(Unknown > Source) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(java.base@17.0.9/Unknown > Source) > at > java.util.stream.Streams$ConcatSpliterator.forEachRemaining(java.base@17.0.9/Unknown > Source) > at > java.util.stream.ReferencePipeline$Head.forEach(java.base@17.0.9/Unknown > Source) > at > org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:749) > at > org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:392) > > {noformat} > ContextInitializationKey.waitForReadyService should have been unblocked by > CoreContainerProvider#init, which is calling ServiceHolder#setService. This > should work because CoreContainerProvider#init is always called before > SolrDispatchFilter#init (ServletContextListeners are initialized before > Filters). > But there's a problem: CoreContainerProvider#init stores the > ContextInitializationKey and the mapped ServiceHolder in > CoreContainerProvider#services, and that's a *WeakHashMap*: > {code:java} > services > .computeIfAbsent(new ContextInitializationKey(servletContext), > ServiceHolder::new) > .setService(this); > {code} > The key is not referenced anywhere else, which makes the mapping a candidate > for garbage collection. The ServiceHolder value also does not reference the > key anymore, because #setService cleared the reference. > With bad luck, the mapping is already gone from the WeakHashMap before > SolrDispatchFilter#init tries to retrieve it with > CoreContainerProvider#serviceForContext. And that method will then create a > new ContextInitializationKey and ServiceHolder, which is then used for > #waitForReadyService. But such a new ContextInitializationKey has never > received a #makeReady call, and #waitForReadyService will block forever. 
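The root cause hinges on WeakHashMap semantics: once nothing else strongly references a key, its entry may silently disappear after a GC cycle. A minimal illustration in generic Java, not Solr's classes:

```java
import java.util.Map;
import java.util.WeakHashMap;

// Demonstrates the failure mode: a WeakHashMap entry survives only while
// something holds a strong reference to the key.
public class WeakKeyDemo {
    static final class ContextKey { } // stands in for ContextInitializationKey

    public static void main(String[] args) throws Exception {
        Map<ContextKey, String> services = new WeakHashMap<>();

        ContextKey key = new ContextKey();
        services.put(key, "serviceHolder");

        // While a strong reference to the key exists, the entry is stable.
        System.out.println(services.containsKey(key)); // true

        key = null; // drop the only strong reference, mirroring what happens
                    // once setService has cleared it

        // After GC the entry is eligible for removal; exactly when is up to
        // the JVM, which is why the resulting hang is so intermittent.
        for (int i = 0; i < 50 && !services.isEmpty(); i++) {
            System.gc();
            Thread.sleep(10);
        }
        System.out.println("entries remaining: " + services.size());
    }
}
```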
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17007) TestDenseVectorFunctionQuery reproducible failures
[ https://issues.apache.org/jira/browse/SOLR-17007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848752#comment-17848752 ] David Smiley commented on SOLR-17007: - [GE|https://ge.apache.org/scans/tests?search.relativeStartTime=P90D=solr-root=America%2FNew_York=org.apache.solr.search.function.TestDenseVectorFunctionQuery] shows sporadic failures as well. Today [I got several test failures from this class|https://github.com/apache/solr/actions/runs/9186812798/job/25263204671?pr=2474] on my PR... {noformat} Caused by: java.lang.IllegalArgumentException: no byte vector value is indexed for field 'vector_byte_encoding' at org.apache.lucene.queries.function.valuesource.ByteKnnVectorFieldSource.getValues(ByteKnnVectorFieldSource.java:45) at org.apache.lucene.queries.function.valuesource.VectorSimilarityFunction.getValues(VectorSimilarityFunction.java:48) at org.apache.lucene.queries.function.FunctionQuery$AllScorer.(FunctionQuery.java:115) at org.apache.lucene.queries.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:76) at org.apache.lucene.search.Weight.scorerSupplier(Weight.java:135) {noformat} with a number of the tests failing on this class. > TestDenseVectorFunctionQuery reproducible failures > -- > > Key: SOLR-17007 > URL: https://issues.apache.org/jira/browse/SOLR-17007 > Project: Solr > Issue Type: Test >Reporter: Chris M. Hostetter >Priority: Major > Attachments: apache_solr_Solr-NightlyTests-main_928.log.txt, > apache_solr_Solr-NightlyTests-main_931.log.txt, > thetaphi_solr_Solr-main-Linux_14822.log.txt > > > In the past week, the same 5 test methods of TestDenseVectorFunctionQuery > have all failed 3 times - in the same 3 jenkins builds (ie: same master seed > - which reproduces locally for me) and all of the test (method) failures have > the same root cause ... strongly suggesting that some aspect of the static, > or test class level, randomization is breaking these methods. > > Recent example... 
> {noformat} > ./gradlew test --tests TestDenseVectorFunctionQuery > -Dtests.seed=749AD19AB618219E -Dtests.multiplier=2 -Dtests.nightly=true > -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Solr/Solr-NightlyTests-main/test-data/enwiki.random.lines.txt > -Dtests.locale=fr-MQ -Dtests.timezone=Asia/Novosibirsk -Dtests.asserts=true > -Dtests.file.encoding=UTF-8 > ... > org.apache.solr.search.function.TestDenseVectorFunctionQuery > > floatFieldVectors_missingFieldValue_shouldReturnSimilarityZero FAILED > java.lang.RuntimeException: Exception during query > at > __randomizedtesting.SeedInfo.seed([749AD19AB618219E:E0B29A3AECE5D888]:0) > at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:989) > at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:947) > at > org.apache.solr.search.function.TestDenseVectorFunctionQuery.floatFieldVectors_missingFieldValue_shouldReturnSimilarityZero(TestDenseVectorFunctionQuery.java:173) > ... > Caused by: > java.lang.IllegalArgumentException: no float vector value is indexed > for field 'vector2' > at > org.apache.lucene.queries.function.valuesource.FloatKnnVectorFieldSource.getValues(FloatKnnVectorFieldSource.java:45) > at > org.apache.lucene.queries.function.valuesource.VectorSimilarityFunction.getValues(VectorSimilarityFunction.java:48) > at > org.apache.lucene.queries.function.FunctionQuery$AllScorer.(FunctionQuery.java:115) > at > org.apache.lucene.queries.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:76) > at org.apache.lucene.search.Weight.scorerSupplier(Weight.java:135) > at > org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:515) > at org.apache.lucene.search.Weight.bulkScorer(Weight.java:165) > at > org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:368) > at > org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:759) > at > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:720) > at > 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:549) > at > org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:275) > at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1878) > at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1695) > at > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:710) > at >
[jira] [Commented] (SOLR-17118) Solr deadlock during servlet container start
[ https://issues.apache.org/jira/browse/SOLR-17118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848376#comment-17848376 ] David Smiley commented on SOLR-17118: - I've been stumped a number of times over the years looking at an unreproducible test failure that shows SolrDispatchFilter.init calling CoreContainerProvider.waitForReadyService and waiting on a CountDownLatch indefinitely. Meanwhile, no other trace or problems in the logs. Eventually the test will time out and we see a thread dump. I suspect a timing bug of exactly when GC happens interplaying with the use of WeakHashMap. In particular I see ContextInitializationKey's constructor publishing "this" to the ServletContext which seems like a bad place to put such logic (constructors publishing themselves is suspicious in general; avoid it). But the point is that it'll overwrite an existing entry in the context that may very well be there, thus suddenly making an existing entry in a WeakHashMap weakly reachable and it may be removed. {*}There is too much complexity there{*}; I think it should be overhauled a bit. > Solr deadlock during servlet container start > > > Key: SOLR-17118 > URL: https://issues.apache.org/jira/browse/SOLR-17118 > Project: Solr > Issue Type: Bug > Components: Server >Affects Versions: 9.2.1 >Reporter: Andreas Hubold >Priority: Major > Labels: deadlock, servlet-context > > In rare cases, Solr can run into a deadlock when started. 
The servlet > container startup thread gets blocked and there's no other thread that could > unblock it: > {noformat} > "main" #1 prio=5 os_prio=0 cpu=5922.39ms elapsed=7490.27s > tid=0x7f637402ae70 nid=0x47 waiting on condition [0x7f6379488000] >java.lang.Thread.State: WAITING (parking) > at jdk.internal.misc.Unsafe.park(java.base@17.0.9/Native Method) > - parking to wait for <0x81da8000> (a > java.util.concurrent.CountDownLatch$Sync) > at java.util.concurrent.locks.LockSupport.park(java.base@17.0.9/Unknown > Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@17.0.9/Unknown > Source) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(java.base@17.0.9/Unknown > Source) > at java.util.concurrent.CountDownLatch.await(java.base@17.0.9/Unknown > Source) > at > org.apache.solr.servlet.CoreContainerProvider$ContextInitializationKey.waitForReadyService(CoreContainerProvider.java:523) > at > org.apache.solr.servlet.CoreContainerProvider$ServiceHolder.getService(CoreContainerProvider.java:562) > at > org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:148) > at > org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:133) > at > org.eclipse.jetty.servlet.ServletHandler.lambda$initialize$2(ServletHandler.java:725) > at > org.eclipse.jetty.servlet.ServletHandler$$Lambda$315/0x7f62fc2674b8.accept(Unknown > Source) > at > java.util.ArrayList$ArrayListSpliterator.forEachRemaining(java.base@17.0.9/Unknown > Source) > at > java.util.stream.Streams$ConcatSpliterator.forEachRemaining(java.base@17.0.9/Unknown > Source) > at > java.util.stream.ReferencePipeline$Head.forEach(java.base@17.0.9/Unknown > Source) > at > org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:749) > at > org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:392) > > {noformat} > ContextInitializationKey.waitForReadyService should have been unblocked by > 
CoreContainerProvider#init, which is calling ServiceHolder#setService. This > should work because CoreContainerProvider#init is always called before > SolrDispatchFilter#init (ServletContextListeners are initialized before > Filters). > But there's a problem: CoreContainerProvider#init stores the > ContextInitializationKey and the mapped ServiceHolder in > CoreContainerProvider#services, and that's a *WeakHashMap*: > {code:java} > services > .computeIfAbsent(new ContextInitializationKey(servletContext), > ServiceHolder::new) > .setService(this); > {code} > The key is not referenced anywhere else, which makes the mapping a candidate > for garbage collection. The ServiceHolder value also does not reference the > key anymore, because #setService cleared the reference. > With bad luck, the mapping is already gone from the WeakHashMap before > SolrDispatchFilter#init tries to retrieve it with > CoreContainerProvider#serviceForContext. And that method will then create a > new ContextInitializationKey and ServiceHolder, which is then used for > #waitForReadyService. But such a new ContextInitializationKey has never > received a #makeReady call, and
> #waitForReadyService will block forever.
[jira] [Updated] (SOLR-17305) SolrDispatchFilter CoreContainerProvider can hang on startup
[ https://issues.apache.org/jira/browse/SOLR-17305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17305: Fix Version/s: (was: SOLR-15590) > SolrDispatchFilter CoreContainerProvider can hang on startup > > > Key: SOLR-17305 > URL: https://issues.apache.org/jira/browse/SOLR-17305 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > > I've been stumped a number of times over the years looking at an > unreproducible test failure that shows SolrDispatchFilter.init calling > CoreContainerProvider.waitForReadyService and waiting on a CountDownLatch > indefinitely. Meanwhile, no other trace or problems in the logs. Eventually > the test will time out and we see a thread dump. > I suspect a timing bug of exactly when GC happens interplaying with the use > of WeakHashMap. In particular I see ContextInitializationKey's constructor > publishing "this" to the ServletContext which seems like a bad place to put > such logic (constructors publishing themselves is suspicious in general; > avoid it). But the point is that it'll overwrite an existing entry in the > context that may very well be there, thus suddenly making an existing entry > in a WeakHashMap weakly reachable and it may be removed. There is too much > complexity there; I think it should be overhauled a bit. I'm working on a PR. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17305) SolrDispatchFilter CoreContainerProvider can hang on startup
[ https://issues.apache.org/jira/browse/SOLR-17305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17305. - Fix Version/s: SOLR-15590 Resolution: Duplicate > SolrDispatchFilter CoreContainerProvider can hang on startup > > > Key: SOLR-17305 > URL: https://issues.apache.org/jira/browse/SOLR-17305 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Fix For: SOLR-15590 > > > I've been stumped a number of times over the years looking at an > unreproducible test failure that shows SolrDispatchFilter.init calling > CoreContainerProvider.waitForReadyService and waiting on a CountDownLatch > indefinitely. Meanwhile, no other trace or problems in the logs. Eventually > the test will time out and we see a thread dump. > I suspect a timing bug of exactly when GC happens interplaying with the use > of WeakHashMap. In particular I see ContextInitializationKey's constructor > publishing "this" to the ServletContext which seems like a bad place to put > such logic (constructors publishing themselves is suspicious in general; > avoid it). But the point is that it'll overwrite an existing entry in the > context that may very well be there, thus suddenly making an existing entry > in a WeakHashMap weakly reachable and it may be removed. There is too much > complexity there; I think it should be overhauled a bit. I'm working on a PR. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17305) SolrDispatchFilter CoreContainerProvider can hang on startup
[ https://issues.apache.org/jira/browse/SOLR-17305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848337#comment-17848337 ] David Smiley commented on SOLR-17305: - Stack trace of the blocked thread:
{noformat}
at java.base@11.0.23/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.23/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.23/java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:885)
at java.base@11.0.23/java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1039)
at java.base@11.0.23/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1345)
at java.base@11.0.23/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:232)
at app//org.apache.solr.servlet.CoreContainerProvider$ContextInitializationKey.waitForReadyService(CoreContainerProvider.java:525)
at app//org.apache.solr.servlet.CoreContainerProvider$ServiceHolder.getService(CoreContainerProvider.java:564)
at app//org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:144)
at app//org.eclipse.jetty.servlet.FilterHolder.initialize(FilterHolder.java:133)
at app//org.eclipse.jetty.servlet.ServletHandler.initializeHolders(ServletHandler.java:774)
at app//org.eclipse.jetty.servlet.ServletHandler.setFilters(ServletHandler.java:1472)
at app//org.eclipse.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:992)
at app//org.eclipse.jetty.servlet.ServletContextHandler.addFilter(ServletContextHandler.java:480)
at app//org.apache.solr.embedded.JettySolrRunner$1.lifeCycleStarted(JettySolrRunner.java:427)
at app//org.eclipse.jetty.util.component.AbstractLifeCycle.setStarted(AbstractLifeCycle.java:253)
at app//org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:94)
at app//org.apache.solr.embedded.JettySolrRunner.retryOnPortBindFailure(JettySolrRunner.java:616)
at app//org.apache.solr.embedded.JettySolrRunner.start(JettySolrRunner.java:554)
at app//org.apache.solr.embedded.JettySolrRunner.start(JettySolrRunner.java:525)
at app//org.apache.solr.util.SolrJettyTestRule.startSolr(SolrJettyTestRule.java:91)
at app//org.apache.solr.SolrJettyTestBase.createAndStartJetty(SolrJettyTestBase.java:95)
at app//org.apache.solr.SolrJettyTestBase.createAndStartJetty(SolrJettyTestBase.java:60)
at app//org.apache.solr.util.RestTestBase.createJettyAndHarness(RestTestBase.java:57)
at app//org.apache.solr.ltr.TestRerankBase.setuptest(TestRerankBase.java:181)
at app//org.apache.solr.ltr.TestRerankBase.setuptest(TestRerankBase.java:113)
at app//org.apache.solr.ltr.response.transform.TestInterleavingTransformer.before(TestInterleavingTransformer.java:32)
{noformat}
> SolrDispatchFilter CoreContainerProvider can hang on startup > > > Key: SOLR-17305 > URL: https://issues.apache.org/jira/browse/SOLR-17305 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > > I've been stumped a number of times over the years looking at an > unreproducible test failure that shows SolrDispatchFilter.init calling > CoreContainerProvider.waitForReadyService and waiting on a CountDownLatch > indefinitely. Meanwhile, no other trace or problems in the logs. Eventually > the test will time out and we see a thread dump. > I suspect a timing bug in exactly when GC happens, interplaying with the use > of WeakHashMap. In particular I see ContextInitializationKey's constructor > publishing "this" to the ServletContext, which seems like a bad place to put > such logic (constructors publishing themselves is suspicious in general; > avoid it). 
But the point is that it'll overwrite an existing entry in the > context that may very well be there, thus suddenly making an existing entry > in a WeakHashMap weakly reachable and it may be removed. There is too much > complexity there; I think it should be overhauled a bit. I'm working on a PR. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
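The WeakHashMap hazard described in this thread can be illustrated in isolation. The sketch below is generic JDK behavior, not Solr's actual CoreContainerProvider code; the names (contextAttrs, "init-key", "service-holder") are invented stand-ins for the ServletContext attribute map and its contents:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakKeyDemo {
    // Returns the WeakHashMap size observed while the key is still strongly
    // referenced (deterministically 1), then shows that overwriting the last
    // strong reference makes the entry eligible for silent removal.
    public static int demo() {
        Map<Object, String> weakMap = new WeakHashMap<>();
        Map<String, Object> contextAttrs = new HashMap<>(); // stand-in for ServletContext attributes

        Object key = new Object();
        contextAttrs.put("init-key", key); // the constructor "publishes" the key here
        weakMap.put(key, "service-holder");
        key = null;                        // local reference gone; the context still holds one
        System.gc();
        int whileHeld = weakMap.size();    // still 1: the key is strongly reachable via contextAttrs

        contextAttrs.put("init-key", new Object()); // overwriting drops the last strong reference
        for (int i = 0; i < 50 && !weakMap.isEmpty(); i++) {
            System.gc();                   // the entry is now only weakly reachable...
            try {
                Thread.sleep(10);          // ...and a GC cycle may silently purge it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        System.out.println("held=" + whileHeld + " afterOverwrite=" + weakMap.size());
        return whileHeld;
    }

    public static void main(String[] args) {
        demo();
    }
}
```

The deterministic part is that an entry cannot be purged while any strong reference to its key exists; once the context attribute is overwritten, removal depends on GC timing, which is exactly what makes the hang unreproducible.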
[jira] [Created] (SOLR-17305) SolrDispatchFilter CoreContainerProvider can hang on startup
David Smiley created SOLR-17305: --- Summary: SolrDispatchFilter CoreContainerProvider can hang on startup Key: SOLR-17305 URL: https://issues.apache.org/jira/browse/SOLR-17305 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: David Smiley Assignee: David Smiley I've been stumped a number of times over the years looking at an unreproducible test failure that shows SolrDispatchFilter.init calling CoreContainerProvider.waitForReadyService and waiting on a CountDownLatch indefinitely. Meanwhile, no other trace or problems in the logs. Eventually the test will time out and we see a thread dump. I suspect a timing bug in exactly when GC happens, interplaying with the use of WeakHashMap. In particular I see ContextInitializationKey's constructor publishing "this" to the ServletContext, which seems like a bad place to put such logic (constructors publishing themselves is suspicious in general; avoid it). But the point is that it'll overwrite an existing entry in the context that may very well be there, thus suddenly making an existing entry in a WeakHashMap only weakly reachable, so it may be removed. There is too much complexity there; I think it should be overhauled a bit. I'm working on a PR. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Comment Edited] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848325#comment-17848325 ] David Smiley edited comment on SOLR-10654 at 5/21/24 6:26 PM: -- A possible way to have it both ways would be to +embed+ the Prometheus Exporter _partially_, without the caching aspect -- it'd be a request handler that fetches metrics locally (would talk to MetricsHandler in a direct way) and post-processes via JQ. I don't love JQ but... hey, some do. XSLT/XQuery is more my thing and supports streaming. No new dependencies to add directly to Solr; people would just add the Exporter's as if it's a module. Regardless of some details, there would still be *some* overhead in this post processing due to JQ. I'm not sure that's the pain point we're solving for here? I haven't measured lately. It could be interesting to compare the performance of the Prometheus Exporter and this patch. was (Author: dsmiley): A possible way to have it both ways would be to +embed+ the Prometheus Exporter _partially_, without the caching aspect -- it'd be a request handler that fetches metrics locally (would talk to MetricsHandler in a direct way) and post-processes via JQ. I don't love JQ but... hey, some do. XSLT/XQuery is more my thing. No new dependencies to add directly to Solr; people would just add the Exporter's as if it's a module. Regardless of some details, there would still be *some* overhead in this post processing due to JQ. I'm not sure that's the pain point we're solving for here? I haven't measured lately. It could be interesting to compare the performance of the Prometheus Exporter and this patch. 
> Expose Metrics in Prometheus format DIRECTLY from Solr > -- > > Key: SOLR-10654 > URL: https://issues.apache.org/jira/browse/SOLR-10654 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Keith Laban >Priority: Major > Attachments: prometheus_metrics.txt > > Time Spent: 3h > Remaining Estimate: 0h > > Expose metrics via a `wt=prometheus` response type. > Example scrape_config in prometheus.yml: > {code}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848325#comment-17848325 ] David Smiley commented on SOLR-10654: - A possible way to have it both ways would be to +embed+ the Prometheus Exporter _partially_, without the caching aspect -- it'd be a request handler that fetches metrics locally (would talk to MetricsHandler in a direct way) and post-processes via JQ. I don't love JQ but... hey, some do. XSLT/XQuery is more my thing. No new dependencies to add directly to Solr; people would just add the Exporter's as if it's a module. Regardless of some details, there would still be *some* overhead in this post processing due to JQ. I'm not sure that's the pain point we're solving for here? I haven't measured lately. It could be interesting to compare the performance of the Prometheus Exporter and this patch. > Expose Metrics in Prometheus format DIRECTLY from Solr > -- > > Key: SOLR-10654 > URL: https://issues.apache.org/jira/browse/SOLR-10654 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Keith Laban >Priority: Major > Attachments: prometheus_metrics.txt > > Time Spent: 3h > Remaining Estimate: 0h > > Expose metrics via a `wt=prometheus` response type. > Example scrape_config in prometheus.yml: > {code}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17303) CVE-2023-39410: Upgrade to apache-avro version 1.11.3
[ https://issues.apache.org/jira/browse/SOLR-17303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17303. - Resolution: Invalid Solr doesn't use Avro. > CVE-2023-39410: Upgrade to apache-avro version 1.11.3 > - > > Key: SOLR-17303 > URL: https://issues.apache.org/jira/browse/SOLR-17303 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: security >Affects Versions: 9.6 >Reporter: Sujeet Hinge >Priority: Major > > CVE-2023-39410: Upgrade Apache-Avro version to 1.11.3 > When deserializing untrusted or corrupted data, it is possible for a reader > to consume memory beyond the allowed constraints and thus lead to out of > memory on the system. This issue affects Java applications using Apache Avro > Java SDK up to and including 1.11.2. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17300) Copy existing listeners on re-creation of Http2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-17300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848258#comment-17848258 ] David Smiley commented on SOLR-17300: - Nicely done. The next step is to back-port to branch_9x. There's a {{dev-tools/cherrypick.sh}} script to make this easier. Once that is done, you can "Resolve" (but not "Close") this JIRA issue and choose the next minor release as the fix-version -- 9.7.0. > Copy existing listeners on re-creation of Http2SolrClient > - > > Key: SOLR-17300 > URL: https://issues.apache.org/jira/browse/SOLR-17300 > Project: Solr > Issue Type: Sub-task >Reporter: Sanjay Dutt >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > For custom settings, such as timeouts, an Http2SolrClient is usually created > from the existing HTTP client using the code below. > {code:java}
> Http2SolrClient.Builder(leaderBaseUrl)
>     .withHttpClient(existingHttp2SolrClient)
>     .withIdleTimeout(soTimeout, TimeUnit.MILLISECONDS)
>     .withConnectionTimeout(connTimeout, TimeUnit.MILLISECONDS)
>     .build();
> {code}
> If not specified, the withHttpClient method automatically copies over some of > the older configuration to the new Http2SolrClient:
> {code:java}
> if (this.basicAuthAuthorizationStr == null) {
>   this.basicAuthAuthorizationStr = http2SolrClient.basicAuthAuthorizationStr;
> }
> if (this.followRedirects == null) {
>   this.followRedirects = http2SolrClient.httpClient.isFollowRedirects();
> }
> if (this.idleTimeoutMillis == null) {
>   this.idleTimeoutMillis = http2SolrClient.idleTimeoutMillis;
> }
> if (this.requestWriter == null) {
>   this.requestWriter = http2SolrClient.requestWriter;
> }
> if (this.requestTimeoutMillis == null) {
>   this.requestTimeoutMillis = http2SolrClient.requestTimeoutMillis;
> }
> if (this.responseParser == null) {
>   this.responseParser = http2SolrClient.parser;
> }
> if (this.urlParamNames == null) {
>   this.urlParamNames = http2SolrClient.urlParamNames;
> }
> {code}
> 
Nonetheless there is one field that is not yet copied over -- the List of > HttpListenerFactory. This list also includes the interceptor for Auth, due to > which re-created clients were missing auth credentials and requests were > failing. > *Proposed Solution:* > Along with the other properties, the List of Listener Factory should also be copied > over from the old to the new client in the withHttpClient method. > {code:java}
> if (this.listenerFactory == null) {
>   this.listenerFactory = new ArrayList<>();
>   http2SolrClient.listenerFactory.forEach(this.listenerFactory::add);
> }
> {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
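The copy-if-null defaulting that the builder performs (and that this issue extends to the listener factories) can be sketched generically. The miniature below is illustrative only -- the class and field names are invented, not Solr's actual Http2SolrClient internals:

```java
import java.util.ArrayList;
import java.util.List;

public class BuilderCopyDemo {
    // Hypothetical miniature of the builder pattern described above: any
    // setting the caller did not specify explicitly is inherited from an
    // existing client, including the list of listener factories.
    static class MiniClient {
        final Long idleTimeoutMillis;
        final List<String> listenerFactories; // stand-in for List<HttpListenerFactory>

        MiniClient(Long idleTimeoutMillis, List<String> listenerFactories) {
            this.idleTimeoutMillis = idleTimeoutMillis;
            this.listenerFactories = listenerFactories;
        }
    }

    static class Builder {
        Long idleTimeoutMillis;
        List<String> listenerFactories;

        Builder withIdleTimeout(long millis) {
            this.idleTimeoutMillis = millis;
            return this;
        }

        // Copy-if-null defaulting: inherit every setting the caller did not
        // override, including (per the proposed fix) the listener factories.
        Builder withHttpClient(MiniClient old) {
            if (this.idleTimeoutMillis == null) {
                this.idleTimeoutMillis = old.idleTimeoutMillis;
            }
            if (this.listenerFactories == null) {
                this.listenerFactories = new ArrayList<>(old.listenerFactories);
            }
            return this;
        }

        MiniClient build() {
            return new MiniClient(
                idleTimeoutMillis,
                listenerFactories == null ? new ArrayList<>() : listenerFactories);
        }
    }

    // Re-create a client with a custom timeout but everything else inherited.
    public static MiniClient recreate(MiniClient old, long timeoutMillis) {
        return new Builder().withIdleTimeout(timeoutMillis).withHttpClient(old).build();
    }

    public static void main(String[] args) {
        MiniClient old = new MiniClient(5000L, List.of("authInterceptor"));
        MiniClient fresh = recreate(old, 100);
        System.out.println(fresh.idleTimeoutMillis + " " + fresh.listenerFactories);
    }
}
```

The bug class this pattern invites is exactly the one reported: each field must be enumerated in withHttpClient by hand, so a newly added field (here, the listener factories) is silently dropped until someone notices the missing behavior.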
[jira] [Commented] (SOLR-17297) Classloading issue with plugin and modules
[ https://issues.apache.org/jira/browse/SOLR-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848000#comment-17848000 ] David Smiley commented on SOLR-17297: - I believe it's only one classloader across modules & shardLib. On the other hand, "packages" are isolated. If one day modules get classloader isolation, then I think we would have to invoke the Lucene SPI stuff for each separate classloader. > Classloading issue with plugin and modules > -- > > Key: SOLR-17297 > URL: https://issues.apache.org/jira/browse/SOLR-17297 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 9.3 >Reporter: Patson Luk >Priority: Critical > > h2. Summary > Using a plugin jar and enabling any modules could trigger > {{java.lang.ClassNotFoundException}} > h2. Description > 1. An implementation of {{org.apache.lucene.codecs.PostingsFormat}} with the > jar within the /lib > 2. Enable modules in solr.xml, for example {{<str name="modules">opentelemetry</str>}} > 3. Now on startup, as part of {{NodeConfig#setupSharedLib}}, it loads > all the SPIs; it locates our jar and loads the class with a > {{FactoryURLClassLoader}} with the classpath pointing at the jar of the lib, > which is correct > 4. After {{NodeConfig#setupSharedLib}}, {{NodeConfig#initModules}} is > invoked, which eventually calls {{SolrResourceLoader#addURLsToClassLoader}}, > which closes the previous class loader -- the one used in 3. > 5. Now a core is loaded with that codec; it runs code which > references other classes within our plugin jar, but unfortunately it uses > the classloader that loaded our class in step 3, and that loader is > marked as "closed", hence it can no longer load the correct resource/class. This > triggers ClassNotFoundException. 
> I have tried several things, the only thing that seems to work so far is > commenting out {{IOUtils.closeWhileHandlingException(oldLoader);}} in > {{SolrResourceLoader#addURLsToClassLoader}}, which is likely not the right > workaround as the {{closeWhileHandlingException}} should be there for a > reason ;) > Switching {{setupSharedLib}} and {{initModules}} might work too (haven't > tested), but I don't want to try any weird changes since I don't really know > the ordering significance. > Would appreciate some helps from the Solr experts! :) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
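The failure mode in step 5 -- a classloader that has been closed can no longer serve resources or classes -- is easy to reproduce with a plain JDK URLClassLoader. This is a generic demonstration of the mechanism, not Solr's FactoryURLClassLoader; the jar contents and names are invented:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class ClosedLoaderDemo {
    // Builds a throwaway jar holding one resource, then shows that the same
    // URLClassLoader serves the resource before close() but not after -- the
    // situation step 5 describes, where the old shared-lib loader is closed
    // while already-loaded classes still point at it.
    public static boolean[] beforeAndAfterClose() {
        try {
            Path jar = Files.createTempFile("closed-loader-demo", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out)) {
                jos.putNextEntry(new JarEntry("marker.txt"));
                jos.write("hello".getBytes(StandardCharsets.UTF_8));
                jos.closeEntry();
            }
            URLClassLoader loader =
                new URLClassLoader(new URL[] {jar.toUri().toURL()}, null); // no parent lookup
            boolean foundBefore = loader.getResource("marker.txt") != null;
            loader.close(); // analogous to closing the old loader in addURLsToClassLoader
            boolean foundAfter = loader.getResource("marker.txt") != null;
            Files.deleteIfExists(jar);
            return new boolean[] {foundBefore, foundAfter};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = beforeAndAfterClose();
        System.out.println("found before close: " + r[0] + ", after close: " + r[1]);
    }
}
```

Per the URLClassLoader.close() javadoc, classes and resources from the loader's URLs cannot be found after close -- which is why either not closing the old loader or reordering setupSharedLib/initModules changes the observed behavior.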
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847998#comment-17847998 ] David Smiley commented on SOLR-10654: - Overall I wonder what others think about maintaining two parallel yet consistent metrics mappings -- one in the "Prometheus Exporter", configured using lots of "jq", which is very flexible (intended for users to configure/hack as needed); the second what this PR does, basically as hard-coded as can be. For example, if we add a new metric, we then probably need to update the exporter's config, and also edit the source code being added here. This could be helped by having the Prometheus Exporter fetch certain metrics pass-through on-demand. But based on the design of the Prometheus Exporter, I think that could be tricky/awkward. > Expose Metrics in Prometheus format DIRECTLY from Solr > -- > > Key: SOLR-10654 > URL: https://issues.apache.org/jira/browse/SOLR-10654 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Keith Laban >Priority: Major > Attachments: prometheus_metrics.txt > > Time Spent: 3h > Remaining Estimate: 0h > > Expose metrics via a `wt=prometheus` response type. > Example scrape_config in prometheus.yml: > {code}
> scrape_configs:
>   - job_name: 'solr'
>     metrics_path: '/solr/admin/metrics'
>     params:
>       wt: ["prometheus"]
>     static_configs:
>       - targets: ['localhost:8983']
> {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17296) rerank w/scaling (still) broken when using debug to get explain info
[ https://issues.apache.org/jira/browse/SOLR-17296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847402#comment-17847402 ] David Smiley commented on SOLR-17296: - I just wish to +1 Hossman's sentiments. Toggling debugging should not change the semantics! > rerank w/scaling (still) broken when using debug to get explain info > > > Key: SOLR-17296 > URL: https://issues.apache.org/jira/browse/SOLR-17296 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 9.4 >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-17296.test.patch > > Time Spent: 10m > Remaining Estimate: 0h > > The changes made in SOLR-16931 (9.4) attempted to work around problems that > existed when attempting to enable debugging (to get score explanations) in > combination with using {{reRankScale}} ... > {quote}The reason for this is that in order to do proper explain for > minMaxScaling you need to know the min and max score in the result set. This > piece of state is maintained in the ReRankScaler itself which is inside of > the ReRankQuery. But for this information to be populated the query must > first be run. In distributed mode, explain is called in the second pass when > the ids query is run so the state needed for the explain is not populated. ... > {quote} > However, the solution attempted was incomplete and failed to account for > multiple factors... > {quote}... The PR attached to this addresses this problem by doing a single > pass distributed query if debugQuery is turned on and if reRank score scaling > is applied. I'll add a distributed test for this as well. > This change is very limited in scope because the single pass distributed is > only switched on in the very specific case when debugQuery=true and > reRankScaling is on. > {quote} > > * NPEs are still possible... 
> ** Instead of checking for {{ResponseBuilder.isDebugResults()}} (which is > what triggers explain logic) the new code only checked for specific debug > request param combinations: > *** {{debugQuery=true}} (a legacy option intended only for backcompat) > *** {{debug=true}} (intended as an alias for {{debug=all}}) > ** It did not check for either of these options, which if used will still > trigger an NPE... > *** {{debug=results}} (which actually dictates the value of > {{ResponseBuilder.isDebugResults()}}) > *** {{debug=all}} (a shorthand for setting all debug options) > * the attempt to force a single pass query didn't modify the correct variable > ** The new code modified a conditional based on a {{boolean > distribSinglePass}} for setting {{sreq.purpose}} and > {{rb.onePassDistributedQuery}} > ** But it did not modify the value of the {{boolean distribSinglePass}} > itself - meaning other logic that uses that variable in that method still > assumes multiple passes will be used. > ** In particular, this means that even though a single pass is used for > both {{PURPOSE_GET_TOP_IDS}} and {{PURPOSE_GET_FIELDS}}, the full {{"fl"}} > requested by the user is not propagated as part of this request > *** Only the uniqueKey and any sort fields are ultimately returned to the > user. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17297) Classloading issue with plugin and modules
[ https://issues.apache.org/jira/browse/SOLR-17297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847147#comment-17847147 ] David Smiley commented on SOLR-17297: - Interesting. SolrResourceLoader.addToClassLoader has a clear contract in its javadocs that it MUST only be called prior to using it to get any resources. Yet I suspect the underlying class loader here has already loaded stuff, based on your description. bq. Switching setupSharedLib and initModules might work too (haven't tested) My thoughts exactly. Try that :-) FWIW where I work, our Lucene PostingsFormat/Codec level stuff goes into WEB-INF/lib because we ran into issues with it in sharedLib. I'm glad to see you've looked closely at the matter. > Classloading issue with plugin and modules > -- > > Key: SOLR-17297 > URL: https://issues.apache.org/jira/browse/SOLR-17297 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 9.3 >Reporter: Patson Luk >Priority: Critical > > h2. Summary > Using a plugin jar and enabling any modules could trigger > {{java.lang.ClassNotFoundException}} > h2. Description > 1. An implementation of {{org.apache.lucene.codecs.PostingsFormat}} with the > jar within the /lib > 2. Enable modules in solr.xml, for example {{<str name="modules">opentelemetry</str>}} > 3. Now on startup, as part of {{NodeConfig#setupSharedLib}}, it loads > all the SPIs; it locates our jar and loads the class with a > {{FactoryURLClassLoader}} with the classpath pointing at the jar of the lib, > which is correct > 4. After {{NodeConfig#setupSharedLib}}, {{NodeConfig#initModules}} is > invoked, which eventually calls {{SolrResourceLoader#addURLsToClassLoader}}, > which closes the previous class loader -- the one used in 3. > 5. 
Now a core is loaded with that codec, it runs the code which > references other classes within our plugin jar, but unfortunately it would > use the Classloader that loads our class in step 3., and such loader is > marked as "closed" hence no longer load the correct resource/class. This > triggers ClassNotFoundException. > I have tried several things, the only thing that seems to work so far is > commenting out {{IOUtils.closeWhileHandlingException(oldLoader);}} in > {{SolrResourceLoader#addURLsToClassLoader}}, which is likely not the right > workaround as the {{closeWhileHandlingException}} should be there for a > reason ;) > Switching {{setupSharedLib}} and {{initModules}} might work too (haven't > tested), but I don't want to try any weird changes since I don't really know > the ordering significance. > Would appreciate some helps from the Solr experts! :) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17275. - Resolution: Fixed > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 1h > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! 
> I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]:
> {noformat}
> Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by "suggest-solrThreadPool-thread-34" Id=582
> at app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506)
> - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d
> at app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820)
> at app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255)
> at app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927)
> ...
> Number of locked synchronizers = 1
> - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3
> {noformat}
> At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. 
I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
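The shape of the regression described above can be sketched abstractly: a lookup that treats "name not in the local cache" as "state is stale, force an expensive global refresh" degenerates badly when the requested name is an alias, because aliases are never in the collection cache and so every single request pays the refresh cost. The sketch below is an invented illustration of that pattern, not SolrJ's actual ZkClientClusterStateProvider code:

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class AliasLookupDemo {
    // Simplified model: the cache only knows real collection names; aliases
    // are resolved elsewhere. Counting forced refreshes shows the cost of
    // refreshing on every cache miss.
    static final Set<String> cachedCollections = Set.of("products", "orders");
    static final AtomicInteger forcedRefreshes = new AtomicInteger();

    // Returns true when the name resolved from the cache; on a miss, models
    // the 9.6.0 behavior of falling into a synchronized, ZooKeeper-backed
    // forceUpdateCollection() before giving up.
    static boolean getState(String name) {
        if (!cachedCollections.contains(name)) {
            forcedRefreshes.incrementAndGet(); // the expensive, globally-locked path
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) {
            getState("products-alias"); // an alias always misses the cache...
        }
        // ...so every request takes the forced-refresh penalty, and under
        // concurrency those refreshes serialize behind one lock (the BLOCKED
        // threads in the dump above).
        System.out.println("forced refreshes: " + forcedRefreshes.get());
    }
}
```

This is why the fix direction matters: either the alias must be resolved before consulting the cache, or a miss must not be treated as grounds for a forced, globally synchronized refresh.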
[jira] [Commented] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846758#comment-17846758 ] David Smiley commented on SOLR-17275: - I'll merge the change tomorrow morning to ensure anyone else has an opportunity to review. It's surprising to see that forceUpdateCollection has this global synchronization lock, and that it was *this*, not some extra ZK call, that is the root of the problem. > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > Time Spent: 20m > Remaining Estimate: 0h > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! 
> I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]:
> {noformat}
> Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by "suggest-solrThreadPool-thread-34" Id=582
> at app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506)
> - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d
> at app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892)
> at app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820)
> at app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255)
> at app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927)
> ...
> Number of locked synchronizers = 1
> - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3
> {noformat}
> At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. 
I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17263) HttpJdkSolrClient doesn't encode curly braces etc
[ https://issues.apache.org/jira/browse/SOLR-17263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846638#comment-17846638 ] David Smiley commented on SOLR-17263: - I'm a bit confused by all these PRs but I merged the first one to main, branch_9x, and branch_9_6 as the comments above show. Maybe someone else can help out with the others. > HttpJdkSolrClient doesn't encode curly braces etc > - > > Key: SOLR-17263 > URL: https://issues.apache.org/jira/browse/SOLR-17263 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 >Reporter: Andy Webb >Priority: Major > Fix For: 9.6.1 > > Time Spent: 2h > Remaining Estimate: 0h > > Ref > https://issues.apache.org/jira/browse/SOLR-599?focusedCommentId=17842429#comment-17842429 > - {{HttpJdkSolrClient}} should use {{SolrParams}}' {{toQueryString()}} > method when constructing URLs so that all URL-unsafe characters are encoded. > It's implicitly using the {{toString()}} method currently, which is intended > for logging etc purposes. > Attempting to use alternate query parsers in requests as shown below will > currently fail as the curly braces aren't encoded. > {noformat} > myquery.set("fq", "{!terms f=myfield}value1,value2"); {noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
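The encoding gap at the heart of this bug can be demonstrated with the JDK alone. The sketch below uses java.net.URLEncoder (form encoding, which is what a query-string value needs) on the exact local-params string from the report; it is an illustration of why the raw string is URL-unsafe, not a reproduction of HttpJdkSolrClient itself:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeDemo {
    // A local-params query value like "{!terms f=myfield}value1,value2" is
    // unsafe to place in a URL verbatim: '{', '}', '!', '=', and ',' must all
    // be percent-encoded (and the space becomes '+').
    public static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "{!terms f=myfield}value1,value2";
        // prints fq=%7B%21terms+f%3Dmyfield%7Dvalue1%2Cvalue2
        System.out.println("fq=" + encode(raw));
    }
}
```

A toString()-style rendering would emit the braces literally, producing a request line the server may reject or misparse, which is exactly the failure described above.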
[jira] [Reopened] (SOLR-16505) Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2
[ https://issues.apache.org/jira/browse/SOLR-16505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley reopened SOLR-16505: - Assignee: David Smiley > Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2 > -- > > Key: SOLR-16505 > URL: https://issues.apache.org/jira/browse/SOLR-16505 > Project: Solr > Issue Type: Sub-task >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Fix For: 9.7 > > Time Spent: 8.5h > Remaining Estimate: 0h > > This method and its callers (only RecoveryStrategy) should be converted to a > Jetty HTTP2 client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17293) Umbrella: Decentralized Cluster Processing as default
David Smiley created SOLR-17293: --- Summary: Umbrella: Decentralized Cluster Processing as default Key: SOLR-17293 URL: https://issues.apache.org/jira/browse/SOLR-17293 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Reporter: David Smiley This is an umbrella issue for tracking work required for running SolrCloud with two booleans as default: {{distributedClusterStateUpdates}} and {{distributedCollectionConfigSetExecution}}, which we may rename/refactor (TBD). When they are set, the Overseer has nothing to do except run "Cluster Singleton Plugins" (if you configure any). These have been in Solr for years since well before 9.0 and are tested in a randomized fashion. But they have not experienced real-world usage to our knowledge. There are some scalability concerns, and unclear compatibility with PRS. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17292) PerReplicaStatesOps.persist should propagate failure
David Smiley created SOLR-17292: --- Summary: PerReplicaStatesOps.persist should propagate failure Key: SOLR-17292 URL: https://issues.apache.org/jira/browse/SOLR-17292 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Reporter: David Smiley [PerReplicaStatesOps.persist|https://github.com/apache/solr/blob/f22a51cc64f83f7b1268d9f3a4c50e36249bdd87/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/PerReplicaStatesOps.java#L136] has a retry loop but if the number of tries is exceeded, the method returns and does nothing! The correct behavior should be to throw the relevant exception. _(Reporting this on behalf of [~ilan])_
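The fix being asked for can be sketched generically (hypothetical names; this is not the actual PerReplicaStatesOps code): when retries are exhausted, the last failure is rethrown instead of silently falling off the end of the method.

```java
// Hypothetical sketch of a retry loop that propagates failure, in contrast
// to a loop that silently returns when its tries are exhausted.
public class RetryDemo {
    interface Op { void run() throws Exception; }

    static void runWithRetries(Op op, int maxTries) throws Exception {
        if (maxTries < 1) throw new IllegalArgumentException("maxTries must be >= 1");
        Exception last = null;
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            try {
                op.run();
                return; // success; stop retrying
            } catch (Exception e) {
                last = e; // remember the most recent failure
            }
        }
        // Retries exhausted: propagate the failure instead of returning silently.
        throw last;
    }
}
```

Callers then see the underlying exception (e.g. a ZooKeeper error) rather than a false impression of success.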
[jira] [Updated] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17275: Priority: Blocker (was: Major) > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Blocker > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! 
> I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. 
I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-13131) Category Routed Aliases
[ https://issues.apache.org/jira/browse/SOLR-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846390#comment-17846390 ] David Smiley commented on SOLR-13131: - [~gus] This issue introduced CoreContainer.runAsync using a new coreContainerAsyncExecutor. Comments, years later: * why not use the existing ExecutorService coreContainerWorkExecutor? I don't think we need yet another executor for misc things. * why not simply provide a getter for an ExecutorService instead of implementing runAsync, given that it's not uncommon to want an ExecutorService? * runAsync makes it too easy to just use this one. Solr has a number of ExecutorServices running around intended for different uses (e.g. "updates" (also includes misc. admin stuff BTW), "recovery"), so I'd rather have devs more easily see that we have multiple to choose from, and then they use the ExecutorService directly. > Category Routed Aliases > --- > > Key: SOLR-13131 > URL: https://issues.apache.org/jira/browse/SOLR-13131 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Affects Versions: 9.0 >Reporter: Gus Heck >Assignee: Gus Heck >Priority: Major > Fix For: 8.1, 9.0 > > Attachments: indexingWithCRA.png, indexingwithoutCRA.png, > indexintWithoutCRA2.png > > > This ticket is to add a second type of routed alias in addition to the > current time routed aliases. The new type of alias will allow data-driven > creation of collections based on the values of a field and automated > organization of these collections under an alias that allows the collections > to also be searched as a whole. > The use case in mind at present is an IoT device type segregation, but I > could also see this leading to the ability to direct updates to tenant > specific hardware (in cooperation with autoscaling). 
> This ticket also looks forward to (but does not include) the creation of a > Dimensionally Routed Alias which would allow organizing time routed data also > segregated by device. > Further design details to be added in comments. >
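The API-shape question in the comment above can be illustrated with a self-contained sketch (hypothetical class, not CoreContainer's actual fields): exposing the ExecutorService lets callers see which pool they are using and gives them the full ExecutorService API, whereas a runAsync wrapper hides both.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the two API styles being debated.
public class ContainerSketch {
    private final ExecutorService workExecutor = Executors.newFixedThreadPool(2);

    // Style the comment argues against: callers can't tell which pool runs
    // their task, and only get fire-and-forget semantics.
    public void runAsync(Runnable r) { workExecutor.submit(r); }

    // Style the comment argues for: callers see and choose the pool, and can
    // use submit/invokeAll/Futures as needed.
    public ExecutorService getWorkExecutor() { return workExecutor; }

    public void shutdown() { workExecutor.shutdown(); }
}
```

With the getter, a caller wanting a result simply does `getWorkExecutor().submit(callable)` and keeps the Future, which runAsync cannot offer.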
[jira] [Commented] (SOLR-3443) Optimize hunspell dictionary loading with multiple cores
[ https://issues.apache.org/jira/browse/SOLR-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846298#comment-17846298 ] David Smiley commented on SOLR-3443: Where I work, we have both massive resources in our schema and also thousands of cores per node. The memory (and time to load a schema) problem is easily solved with [shareSchema in solr.xml|https://solr.apache.org/guide/solr/latest/configuration-guide/configuring-solr-xml.html] instead of narrow point-solutions for specific resources. > Optimize hunspell dictionary loading with multiple cores > > > Key: SOLR-3443 > URL: https://issues.apache.org/jira/browse/SOLR-3443 > Project: Solr > Issue Type: Improvement >Reporter: Luca Cavanna >Priority: Major > Attachments: SOLR-3443.patch, Screen Shot 2015-11-29 at 9.52.06 AM.png > > > The Hunspell dictionary is actually loaded into memory. Each core using > hunspell loads its own dictionary, no matter if all the cores are using the > same dictionary files. As a result, the same dictionary is loaded into memory > multiple times, once for each core. I think we should share those > dictionaries between all cores in order to optimize the memory usage. In > fact, let's say a dictionary takes 20MB into memory (this is what I > detected), if you have 20 cores you are going to use 400MB only for > dictionaries, which doesn't seem a good idea to me. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
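The shareSchema setting referenced above lives in solr.xml. A minimal sketch of the relevant fragment, assuming the stock solr.xml layout (check the Ref Guide for your Solr version before relying on it):

```xml
<!-- solr.xml: share one IndexSchema instance across cores whose schema
     resources are identical, instead of loading a copy per core -->
<solr>
  <str name="shareSchema">true</str>
</solr>
```

With thousands of cores per node, this trades a per-core schema copy (and its load time) for a single shared instance.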
[jira] [Resolved] (SOLR-16505) Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2
[ https://issues.apache.org/jira/browse/SOLR-16505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-16505. - Fix Version/s: 9.7 Resolution: Fixed Thanks for contributing to this issue Sanjay! Definitely was a more interesting journey than it seemed at the outset LOL. > Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2 > -- > > Key: SOLR-16505 > URL: https://issues.apache.org/jira/browse/SOLR-16505 > Project: Solr > Issue Type: Sub-task >Reporter: David Smiley >Priority: Major > Fix For: 9.7 > > Time Spent: 8.5h > Remaining Estimate: 0h > > This method and its callers (only RecoveryStrategy) should be converted to a > Jetty HTTP2 client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17274) atomic update error when using json w/ multiple modifiers
[ https://issues.apache.org/jira/browse/SOLR-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17274. - Fix Version/s: 9.7 Resolution: Fixed Merged. Thanks for the PR Calvin! > atomic update error when using json w/ multiple modifiers > - > > Key: SOLR-17274 > URL: https://issues.apache.org/jira/browse/SOLR-17274 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update >Affects Versions: 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6 > Environment: Observed first in 9.5.0 w/ OpenJDK 11, then reproduced > this simple test case in 9.6.0. >Reporter: Calvin Smith >Priority: Major > Fix For: 9.7 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > I ran into a problem doing a json atomic update that tries to both `add` and > `remove` a value for a multivalued field in a single update. I saw it > initially in an instance that runs 9.5.0, and reproduced a minimal example > using Solr 9.6.0. > {{The only fields defined in the schema are:}} > {code:java} > required="true" stored="true"/> > > {code} > {{`id` is also present, so I'm supplying docs with > just an `id` and a multivalued `name` field. 
The real setup is more complex, > but this is a minimal test case to illustrate the problem.}} > {{Starting with an empty index, I add the following doc to the index > successfully:}} > {code:java} > {"id": "1", "name": ["John Doe", "Jane Doe"]}{code} > {{And I can query it, seeing the expected result:}} > {code:java} > { > "responseHeader":{ > "status":0, > "QTime":23, > "params":{ > "q":"name:*" > } > }, > "response":{ > "numFound":1, > "start":0, > "numFoundExact":true, > "docs":[{ > "id":"1", > "name":["John Doe","Jane Doe"], > "_version_":1797873599884820480 > }] > } > }{code} > {{Next, I send an atomic update to modify the `name` field of that document > by removing `Jane Doe` and adding `Janet Doe`:}} > {code:java} > {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}}{code} > {{This atomic update that does both an `add` and a `remove` is something that > used to work for us under Solr 6.6, but we just noticed that it fails in our > production 9.5 instance and in 9.6, which I just downloaded to test against.}} > {{The error in the solr.log indicates that Solr mistakenly interprets the > `\{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then > throws an exception because our schema doesn't have the `{_}root{_}` field > that would be expected if we were using nested docs (which we don't use).}} > {{Here's the full stack trace from `solr.log` (version 9.6.0):}} > {noformat} > 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris > t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a 
uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > ~[?:?] > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) > ~[?:?] > at > org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1056) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:375) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312) > ~[?:?] > at > org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) > ~[?:?] > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:54) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:270) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:533) > ~[?:?] > at >
[jira] [Updated] (SOLR-17288) automatic preferredLeader
[ https://issues.apache.org/jira/browse/SOLR-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17288: Description: I'd like to have the preferredLeader replica property be assigned automatically by Solr (if configured to). It would be applied when replicas are created for a new shard, like collection creation and restoring a collection from a backup. Shard split is maybe different. I propose that a [ReplicaPlacement|https://github.com/apache/solr/blob/branch_9_6/solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java#L35] decision, computed by a PlacementPlugin, include a "preferredLeader" boolean. Alternatively, we could alter PlacementPlugin.computePlacements method contract to further state that the returned order is significant -- that replicas are created in iteration order. Furthermore, the consumer (in Assign) would mark the first replica as preferredLeader. A default algorithm could simply choose a preferredLeader at random. was: I'd like to have the preferredLeader replica property be assigned automatically by Solr (if configured to). I propose that a [ReplicaPlacement|https://github.com/apache/solr/blob/branch_9_6/solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java#L35] decision, computed by a PlacementPlugin, include a "preferredLeader" boolean. Alternatively, we could alter PlacementPlugin.computePlacements method contract to further state that the returned order is significant -- that replicas are created in iteration order. Furthermore, the consumer (in Assign) would mark the first replica as preferredLeader. A default algorithm could simply choose a preferredLeader at random. > automatic preferredLeader > - > > Key: SOLR-17288 > URL: https://issues.apache.org/jira/browse/SOLR-17288 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. 
Issues are Public) > Components: SolrCloud >Reporter: David Smiley >Priority: Major > > I'd like to have the preferredLeader replica property be assigned > automatically by Solr (if configured to). It would be applied when replicas > are created for a new shard, like collection creation and restoring a > collection from a backup. Shard split is maybe different. > I propose that a > [ReplicaPlacement|https://github.com/apache/solr/blob/branch_9_6/solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java#L35] > decision, computed by a PlacementPlugin, include a "preferredLeader" > boolean. Alternatively, we could alter PlacementPlugin.computePlacements > method contract to further state that the returned order is significant -- > that replicas are created in iteration order. Furthermore, the consumer (in > Assign) would mark the first replica as preferredLeader. > A default algorithm could simply choose a preferredLeader at random. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17288) automatic preferredLeader
David Smiley created SOLR-17288: --- Summary: automatic preferredLeader Key: SOLR-17288 URL: https://issues.apache.org/jira/browse/SOLR-17288 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: SolrCloud Reporter: David Smiley I'd like to have the preferredLeader replica property be assigned automatically by Solr (if configured to). I propose that a [ReplicaPlacement|https://github.com/apache/solr/blob/branch_9_6/solr/core/src/java/org/apache/solr/cluster/placement/ReplicaPlacement.java#L35] decision, computed by a PlacementPlugin, include a "preferredLeader" boolean. Alternatively, we could alter PlacementPlugin.computePlacements method contract to further state that the returned order is significant -- that replicas are created in iteration order. Furthermore, the consumer (in Assign) would mark the first replica as preferredLeader. A default algorithm could simply choose a preferredLeader at random. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
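The "default algorithm could simply choose a preferredLeader at random" idea above can be sketched in a few lines (hypothetical demo; the names are illustrative and not Solr's placement API):

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch: given the replica names chosen for a shard, pick one
// at random to receive the preferredLeader property.
public class PreferredLeaderDemo {
    static int choosePreferredLeader(List<String> replicaNames, Random rnd) {
        if (replicaNames.isEmpty()) throw new IllegalArgumentException("no replicas");
        return rnd.nextInt(replicaNames.size()); // index of the preferred leader
    }
}
```

A real PlacementPlugin would instead fold this choice into the ReplicaPlacement it returns (or rely on iteration order, per the alternative in the description).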
[jira] [Commented] (SOLR-10780) A new collection property autoRebalanceLeaders
[ https://issues.apache.org/jira/browse/SOLR-10780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846106#comment-17846106 ] David Smiley commented on SOLR-10780: - I'd rather see leadership be sticky (e.g. via preferredLeader), maybe even by default, rather than having to explicitly rebalance. Thus a "preferred" leader would recognize itself to be such (e.g. due to explicit assignment and/or perhaps something automatic TBD), and seek to become the leader on its own (e.g. on becoming state=ACTIVE) without anything as heavyweight as this issue describes. > A new collection property autoRebalanceLeaders > --- > > Key: SOLR-10780 > URL: https://issues.apache.org/jira/browse/SOLR-10780 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Noble Paul >Priority: Major > > In SolrCloud, the first replica to get started in a given shard becomes the > leader of that shard. This is a problem during cluster restarts. The first > node to get started has all the leaders, and that node ends up being very heavily > loaded. The solution we have today is to invoke a REBALANCELEADERS command > explicitly so that the system ends up with a uniform distribution of leaders > across nodes. This is a manual operation and we can make the system do it > automatically. > So each collection can have an {{autoRebalanceLeaders}} flag. If it is set > to true, whenever a replica becomes {{ACTIVE}} in a shard, a > {{REBALANCELEADER}} is invoked for that shard
[jira] [Resolved] (SOLR-11999) RebalanceLeaders API should not require a preferredLeader property
[ https://issues.apache.org/jira/browse/SOLR-11999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-11999. - Resolution: Won't Fix This doesn't seem useful. Without preferredLeader, leadership assignment is very arbitrary and unstable. So why not use preferredLeader? If it's because it wasn't assigned in the first place, well yeah, you should do that then! Let's make it easier to assign a preferred leader. Marking as won't-fix; feel free to re-open. > RebalanceLeaders API should not require a preferredLeader property > -- > > Key: SOLR-11999 > URL: https://issues.apache.org/jira/browse/SOLR-11999 > Project: Solr > Issue Type: Improvement > Components: SolrCloud >Reporter: Shalin Shekhar Mangar >Priority: Major > > Rebalance leaders API requires that nodes be set with preferredLeaders > property. But in theory this is not required. We can choose replicas on > unique nodes to be leaders in the absence of the preferredLeader property. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17287) RESTORECORE should reset/clear the UpdateLog
David Smiley created SOLR-17287: --- Summary: RESTORECORE should reset/clear the UpdateLog Key: SOLR-17287 URL: https://issues.apache.org/jira/browse/SOLR-17287 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Reporter: David Smiley I think the semantics of a RESTORECORE (the core level API, not SolrCloud) should be holistic to the core and thus consider the UpdateLog. As such, I think the UpdateLog should be left in an empty state and be ACTIVE. Recently, in SOLR-16924, it was enhanced to call UpdateLog.applyBufferedUpdates with the goal of transitioning from state BUFFERING to ACTIVE, but that doesn't do anything if it's not in a buffering state to begin with (it'll always be BUFFERING in SolrCloud, though that isn't obvious). To prove there is a problem, I modified TestRestoreCore.testSimpleRestore (a good test!) to have a configured UpdateLog and I used RTG after the restore to see if I could get a document that was added *after* the backup was performed. I could. It doesn't matter if someone doesn't use RTG; it's just a means of demonstrating the state is dirty; it should be empty. Thus if a node crashed after a restore, the buffer would be replayed on startup for stuff added prior to the RESTORECORE, which isn't what we want.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845332#comment-17845332 ] David Smiley commented on SOLR-13350: - I suspect timeAllowed support could be added with relative ease. For reasons above, it won't work OOTB though. > Explore collector managers for multi-threaded search > > > Key: SOLR-13350 > URL: https://issues.apache.org/jira/browse/SOLR-13350 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch > > Time Spent: 11h 20m > Remaining Estimate: 0h > > AFAICT, SolrIndexSearcher can be used only to search all the segments of an > index in series. However, using CollectorManagers, segments can be searched > concurrently and result in reduced latency. Opening this issue to explore the > effectiveness of using CollectorManagers in SolrIndexSearcher from latency > and throughput perspective. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
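The CollectorManager idea this issue explores — collect over each index segment concurrently, then reduce the per-segment results — can be illustrated without Lucene (hypothetical demo; int arrays stand in for segments and equality for matching):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Hypothetical illustration of the CollectorManager pattern: each "segment"
// is searched by its own collector on a separate thread, then the per-segment
// results are reduced into one total, lowering latency on multi-segment indexes.
public class CollectorManagerDemo {
    static int countMatches(List<int[]> segments, int query, ExecutorService pool) {
        List<Future<Integer>> perSegment = new ArrayList<>();
        for (int[] segment : segments) {
            perSegment.add(pool.submit(() -> {
                int hits = 0;
                for (int doc : segment) if (doc == query) hits++; // collect
                return hits;
            }));
        }
        int total = 0;
        for (Future<Integer> f : perSegment) {
            try {
                total += f.get(); // reduce the per-segment results
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
        return total;
    }
}
```

Lucene's real CollectorManager has the same two phases (newCollector per slice, then reduce); the issue is about wiring that into SolrIndexSearcher.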
[jira] [Commented] (SOLR-16924) Restore: Have RESTORECORE set the UpdateLog state
[ https://issues.apache.org/jira/browse/SOLR-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845130#comment-17845130 ] David Smiley commented on SOLR-16924: - > (Funnily enough, this is actually an idea I got from you! I'll add the link > in here if I can dig it up...) Yes I do remember now :) it makes sense. It'd be helpful if classes that are distinctly "V1" had javadocs saying so. I'll move the logic. Probably not worth a JIRA. When I see a package "handler", I'm thinking request handlers, thus API (and may include implementation). Even the package-info.java says it's for SolrRequestHandlers. But this class doesn't end in Handler so okay, it's not a handler. I'm good with it. > Restore: Have RESTORECORE set the UpdateLog state > -- > > Key: SOLR-16924 > URL: https://issues.apache.org/jira/browse/SOLR-16924 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Priority: Minor > Fix For: 9.5 > > Time Spent: 50m > Remaining Estimate: 0h > > This is a refactoring improvement designed to simplify & clarify a step in > collection restores. One of the final phases of RestoreCmd (collection > restore) is to call REQUESTAPPLYUPDATES on each newly restored replica in > order to transition the state of the UpdateLog to ACTIVE (not actually to > apply updates). The underlying call on the UpdateLog could instead be done > inside RESTORECORE at the end with explanatory comments as to the intent. I > think it makes more sense that RESTORECORE finishes with its UpdateLog ready. > And it's strange/curious to see requests in the cluster to apply updates > from an updateLog when there is none to do! Adding clarifying comments is > important. > See my comment: > https://issues.apache.org/jira/browse/SOLR-12065?focusedCommentId=17751792=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17751792 > I think there isn't any back-compat concern. 
[jira] [Commented] (SOLR-16743) Auto reload keystore/truststore on change
[ https://issues.apache.org/jira/browse/SOLR-16743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845015#comment-17845015 ] David Smiley commented on SOLR-16743: - Where I work, we use rotating keys but we don't need changes as invasive as this (e.g. with the extra code/complexity added here to accompany it). At the time we used a company Java "KeyStore" that internally dynamically reloads. A bit of custom glue creates a custom SSLContext, and we call `org.apache.solr.client.solrj.impl.Http2SolrClient#setDefaultSSLConfig`. Presently I would recommend users use a managed mesh/Istio with rotating keys, which is more scalable in terms of integration effort, complexity, and maintenance than customizing/configuring SSL on each and every service. > Auto reload keystore/truststore on change > - > > Key: SOLR-16743 > URL: https://issues.apache.org/jira/browse/SOLR-16743 > Project: Solr > Issue Type: Improvement > Components: Server, SolrJ >Reporter: Houston Putman >Assignee: Tomas Eduardo Fernandez Lobbe >Priority: Major > Fix For: main (10.0), 9.5 > > Time Spent: 1h > Remaining Estimate: 0h > > Currently everyone who uses Solr with SSL must restart their clusters when > new certificates are created. > Jetty comes with an > [ssl-reload|https://www.eclipse.org/jetty/documentation/jetty-10/operations-guide/index.html#og-module-ssl-reload] > module for reloading the server's keystore. > For the client we would likely need to reload the truststore, but that > requires more investigation.
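The "internally reloading" glue described above can be sketched with the JDK alone (hypothetical class; the Solr-specific wiring via setDefaultSSLConfig is omitted): consumers always read the current SSLContext from a holder, and whatever watches the keystore for rotation swaps in a fresh context.

```java
import java.security.NoSuchAlgorithmException;
import java.util.concurrent.atomic.AtomicReference;
import javax.net.ssl.SSLContext;

// Hypothetical JDK-only sketch of reload-on-rotation glue: callers read the
// current context from the holder; a keystore watcher calls reload() when the
// key material changes. Feeding this into an HTTP client's SSL config is the
// part left out here.
public class ReloadingSslHolder {
    private final AtomicReference<SSLContext> current = new AtomicReference<>();

    public ReloadingSslHolder() {
        try {
            current.set(SSLContext.getDefault());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** The context to use for the next connection. */
    public SSLContext get() { return current.get(); }

    /** Called by whatever watches the keystore for rotation. */
    public void reload(SSLContext fresh) { current.set(fresh); }
}
```

New connections pick up the fresh context without restarting the process, which is the property the issue's Jetty ssl-reload module provides server-side.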
[jira] [Created] (SOLR-17286) HttpSolrCall.remoteQuery should use Jetty HttpClient
David Smiley created SOLR-17286: --- Summary: HttpSolrCall.remoteQuery should use Jetty HttpClient Key: SOLR-17286 URL: https://issues.apache.org/jira/browse/SOLR-17286 Project: Solr Issue Type: Sub-task Reporter: David Smiley There is Apache HttpClient usage in HttpSolrCall to do a "remote query". It should switch to Jetty HttpClient. Looking at the code details, I don't think org.apache.solr.servlet.CoreContainerProvider#httpClient (the field and getter here) needs to exist. It's just a reference but it can be looked up on-demand from the CoreContainer with less ceremony. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17286) HttpSolrCall.remoteQuery should use Jetty HttpClient
[ https://issues.apache.org/jira/browse/SOLR-17286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17286: Labels: newdev (was: ) > HttpSolrCall.remoteQuery should use Jetty HttpClient > > > Key: SOLR-17286 > URL: https://issues.apache.org/jira/browse/SOLR-17286 > Project: Solr > Issue Type: Sub-task >Reporter: David Smiley >Priority: Major > Labels: newdev > > There is Apache HttpClient usage in HttpSolrCall to do a "remote query". It > should switch to Jetty HttpClient. > Looking at the code details, I don't think > org.apache.solr.servlet.CoreContainerProvider#httpClient (the field and > getter here) needs to exist. It's just a reference but it can be looked up > on-demand from the CoreContainer with less ceremony. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-16505) Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2
[ https://issues.apache.org/jira/browse/SOLR-16505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844985#comment-17844985 ] David Smiley commented on SOLR-16505: - [~markrmiller] , based on the callers of org.apache.solr.update.UpdateShardHandler#getRecoveryOnlyHttpClient, "recovery" refers to IndexFetcher and RecoveryStrategy. Should this be expanded slightly to include PeerSync (see PeerSyncWithLeader using the "default" client), or not, if there's a distinction between index replication and peer-sync with respect to possible configuration needs like timeouts and threads? RecoveryStrategy can initiate peer sync, after all. If you agree, shouldn't this include SyncStrategy, which is related to PeerSync? At a glance, it's not clear why SyncStrategy and RecoveryStrategy aren't the same thing. Javadocs could be added to elaborate on what the scope of this recovery client is, and also why it exists in the first place. There's a sad "bus factor" in this part of Solr, I think, so I really appreciate it when you share your insights. BTW, I debate UpdateShardHandler's name [here|https://github.com/apache/solr/pull/2351#discussion_r1595413371]; you may have an opinion. > Switch UpdateShardHandler.getRecoveryOnlyHttpClient to Jetty HTTP2 > -- > > Key: SOLR-16505 > URL: https://issues.apache.org/jira/browse/SOLR-16505 > Project: Solr > Issue Type: Sub-task >Reporter: David Smiley >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > This method and its callers (only RecoveryStrategy) should be converted to a > Jetty HTTP2 client. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17285) Move RemoteSolrException to SolrClient in v10
David Smiley created SOLR-17285: --- Summary: Move RemoteSolrException to SolrClient in v10 Key: SOLR-17285 URL: https://issues.apache.org/jira/browse/SOLR-17285 Project: Solr Issue Type: Sub-task Components: SolrJ Reporter: David Smiley RemoteSolrException lives in BaseHttpSolrClient. BaseHttpSolrClient should be deprecated; it's sort of replaced by HttpSolrClientBase. Even though this exception is only for Http, SolrClient is a decent parent class. Or make it top-level. To make this transition from 9x to 10x better, we could simply add new classes without removing the old ones in 9x. The old can subclass the new. Eventually all of BaseHttpSolrClient will be removed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
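The 9.x-to-10.x transition sketched in the issue (add the new class in 9.x, have the old one subclass it so existing catch sites keep working) might look like this. Both class names here are hypothetical stand-ins, not Solr's actual class layout.

```java
public class ExceptionMigration {
    // Hypothetical new home for the exception (name assumed for illustration).
    public static class SolrRemoteException extends RuntimeException {
        public SolrRemoteException(String msg) { super(msg); }
    }

    // Legacy class kept through 9.x as a deprecated subclass of the new one,
    // so code that throws or catches the old type still compiles and runs.
    @Deprecated
    public static class RemoteSolrException extends SolrRemoteException {
        public RemoteSolrException(String msg) { super(msg); }
    }

    public static void main(String[] args) {
        try {
            // Old code still throws the legacy type...
            throw new RemoteSolrException("404 from remote node");
        } catch (SolrRemoteException e) {
            // ...while migrated code catches the new type and sees it anyway.
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

In 10.x the deprecated subclass would be deleted, leaving only the new type.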
[jira] [Commented] (SOLR-17196) Update IndexFetcher Class to Use Http2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-17196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844861#comment-17844861 ] David Smiley commented on SOLR-17196: - In retrospect, SOLR-16505 was so much work that splitting this off was wise. > Update IndexFetcher Class to Use Http2SolrClient > > > Key: SOLR-17196 > URL: https://issues.apache.org/jira/browse/SOLR-17196 > Project: Solr > Issue Type: Sub-task >Reporter: Sanjay Dutt >Priority: Major > > Current implementation use the older HttpSolrClient and created client using > PoolingHttpClientConnectionManager calling > `getUpdateShardHandler().getRecoveryOnlyConnectionManager()` > Below is the code that used to create HttpSolrClient in IndexFetcher class. > {code:java} > HttpClientUtil.createClient( > httpClientParams, > core.getCoreContainer().getUpdateShardHandler().getRecoveryOnlyConnectionManager(), > true, > executor);{code} > We are going to replace the above implementation using > Http2SolrClient.Builder and calling shardHandler.getRecoveryOnlyClient to use > the common pool for recovery task to create Http2SolrClient. > > {code:java} > Http2SolrClient httpClient = > new Http2SolrClient.Builder(leaderBaseUrl) > .withHttpClient( > core.getCoreContainer().getUpdateShardHandler().getRecoveryOnlyHttpClient()) > .withBasicAuthCredentials(httpBasicAuthUser, httpBasicAuthPassword) > .build(); > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-17196) Update IndexFetcher Class to Use Http2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-17196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17196. - Resolution: Duplicate > Update IndexFetcher Class to Use Http2SolrClient > > > Key: SOLR-17196 > URL: https://issues.apache.org/jira/browse/SOLR-17196 > Project: Solr > Issue Type: Sub-task >Reporter: Sanjay Dutt >Priority: Major > > Current implementation use the older HttpSolrClient and created client using > PoolingHttpClientConnectionManager calling > `getUpdateShardHandler().getRecoveryOnlyConnectionManager()` > Below is the code that used to create HttpSolrClient in IndexFetcher class. > {code:java} > HttpClientUtil.createClient( > httpClientParams, > core.getCoreContainer().getUpdateShardHandler().getRecoveryOnlyConnectionManager(), > true, > executor);{code} > We are going to replace the above implementation using > Http2SolrClient.Builder and calling shardHandler.getRecoveryOnlyClient to use > the common pool for recovery task to create Http2SolrClient. > > {code:java} > Http2SolrClient httpClient = > new Http2SolrClient.Builder(leaderBaseUrl) > .withHttpClient( > core.getCoreContainer().getUpdateShardHandler().getRecoveryOnlyHttpClient()) > .withBasicAuthCredentials(httpBasicAuthUser, httpBasicAuthPassword) > .build(); > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17192) Maximum-fields-per-core soft limit
[ https://issues.apache.org/jira/browse/SOLR-17192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844785#comment-17844785 ] David Smiley commented on SOLR-17192: - I saw that you listed my name first in CHANGES.txt, which is the position of the person I think of as most responsible for making the work item happen. Clearly, that honor goes to you, Jason. I use this distinction when filtering work items in CHANGES.txt to attribute work to me and my colleagues to see how we contribute to Solr as lead contributors (vs more supporting roles). > Maximum-fields-per-core soft limit > -- > > Key: SOLR-17192 > URL: https://issues.apache.org/jira/browse/SOLR-17192 > Project: Solr > Issue Type: Sub-task > Components: Schema and Analysis >Affects Versions: main (10.0), 9.5.0 >Reporter: Jason Gerlowski >Assignee: Jason Gerlowski >Priority: Major > Fix For: main (10.0), 9.7 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > Solr isn't infinitely scalable when it comes to the number of fields in each > core/collection. Most deployments start to experience problems any time a > core has upwards of a few hundred fields. Usually this doesn't exhibit > itself right away, instead waiting until segment-merge or some other time to > rear its head. > Sometimes users hit this through intentional schema design. Often however, > it happens "accidentally" due to (mis-)use of Solr's "dynamic fields" feature. > We should add a configurable soft-limit, of the type described in SOLR-17191, > to prevent users from unknowingly getting into this state. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17275) Major performance regression of CloudSolrClient in Solr 9.6.0 when using aliases
[ https://issues.apache.org/jira/browse/SOLR-17275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17275: Fix Version/s: 9.6.1 Ouch! The fix should go in 9.6.1. Maybe even aliases could be checked first (why not). > Major performance regression of CloudSolrClient in Solr 9.6.0 when using > aliases > > > Key: SOLR-17275 > URL: https://issues.apache.org/jira/browse/SOLR-17275 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.6.0 > Environment: SolrJ 9.6.0, Ubuntu 22.04, Java 17 >Reporter: Rafał Harabień >Priority: Major > Fix For: 9.6.1 > > Attachments: image-2024-05-06-17-23-42-236.png > > > I observe worse performance of CloudSolrClient after upgrading from SolrJ > 9.5.0 to 9.6.0, especially on p99. > p99 jumped from ~25 ms to ~400 ms > p90 jumped from ~9.9 ms to ~22 ms > p75 jumped from ~7 ms to ~11 ms > p50 jumped from ~4.5 ms to ~7.5 ms > Screenshot from Grafana (at ~14:30 was deployed the new version): > !image-2024-05-06-17-23-42-236.png! 
> I've got a thread-dump and I can see many threads waiting in > [ZkStateReader.forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503]: > {noformat} > Thread info: "suggest-solrThreadPool-thread-52" prio=5 Id=600 BLOCKED on > org.apache.solr.common.cloud.ZkStateReader@62e6bc3d owned by > "suggest-solrThreadPool-thread-34" Id=582 > at > app//org.apache.solr.common.cloud.ZkStateReader.forceUpdateCollection(ZkStateReader.java:506) > - blocked on org.apache.solr.common.cloud.ZkStateReader@62e6bc3d > at > app//org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider.getState(ZkClientClusterStateProvider.java:155) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.resolveAliases(CloudSolrClient.java:1207) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1099) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:892) > at > app//org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:820) > at > app//org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:255) > at > app//org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:927) > ... > Number of locked synchronizers = 1 > - java.util.concurrent.ThreadPoolExecutor$Worker@1beb7ed3 > {noformat} > At the same time qTime from Solr hasn't changed so I'm pretty sure it's a > client regression. > I've tried reproducing it locally and I can see > [forceUpdateCollection|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkStateReader.java#L503] > function being called for every request in my application. 
I can see that > [this|https://github.com/apache/solr/commit/8cf552aa3642be473c6a08ce44feceb9cbe396d7] > commit > changed the logic in ZkClientClusterStateProvider.getState so the mentioned > function gets called if clusterState.getCollectionRef [returns > null|https://github.com/apache/solr/blob/f8e5a93c11267e13b7b43005a428bfb910ac6e57/solr/solrj-zookeeper/src/java/org/apache/solr/client/solrj/impl/ZkClientClusterStateProvider.java#L151]. > In 9.5.0 it wasn't the case (forceUpdateCollection was not called in this > place). I can see in the debugger that getCollectionRef only supports > collections and not aliases (collectionStates map contains only collections). > In my application all collections are referenced using aliases so I guess > that's why I can see the regression in Solr response time. > I am not familiar with the code enough to prepare a PR but I hope this > insight will be enough to fix this issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
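The "aliases could be checked first" suggestion from the comment above could be sketched as below. Plain maps stand in for ZkStateReader's locally cached structures, and all names are illustrative; the point is only the lookup order: cheap local state and alias maps before any forced ZooKeeper refresh.

```java
import java.util.List;
import java.util.Map;

public class AliasFirstResolution {
    // Hypothetical sketch: resolve a requested name against cached
    // collection state and the alias map before concluding the name is
    // unknown and triggering an expensive forceUpdateCollection call.
    static String resolve(String name,
                          Map<String, List<String>> aliases,
                          Map<String, Object> collectionStates) {
        if (collectionStates.containsKey(name)) {
            return name; // a real collection, already cached locally
        }
        List<String> targets = aliases.get(name);
        if (targets != null && !targets.isEmpty()) {
            return targets.get(0); // an alias; no forced refresh needed
        }
        return null; // only now would a forced ZK state fetch be justified
    }

    public static void main(String[] args) {
        Map<String, Object> states = Map.of("books_v2", new Object());
        Map<String, List<String>> aliases = Map.of("books", List.of("books_v2"));
        System.out.println(resolve("books", aliases, states));
        System.out.println(resolve("books_v2", aliases, states));
    }
}
```

Under this ordering, an alias-only deployment like the reporter's would never hit the forced-refresh path on the hot query loop.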
[jira] [Commented] (SOLR-17274) atomic update error when using json w/ multiple modifiers
[ https://issues.apache.org/jira/browse/SOLR-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1787#comment-1787 ] David Smiley commented on SOLR-17274: - Ah right, if the schema isn't child doc enabled, then isChildDoc can return false. I like it! Deja-vu... probably an existing idea written somewhere. If you can submit a PR, that'd be great. Call {{IndexSchema.isUsableForChildDocs() }}to see if we can return early. > atomic update error when using json w/ multiple modifiers > - > > Key: SOLR-17274 > URL: https://issues.apache.org/jira/browse/SOLR-17274 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update >Affects Versions: 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6 > Environment: Observed first in 9.5.0 w/ OpenJDK 11, then reproduced > this simple test case in 9.6.0. >Reporter: Calvin Smith >Priority: Major > > I ran into a problem doing a json atomic update that tries to both `add` and > `remove` a value for a multivalued field in a single update. I saw it > initially in an instance that runs 9.5.0, and reproduced a minimal example > using Solr 9.6.0. > {{The only fields defined in the schema are:}} > {code:java} > required="true" stored="true"/> > > {code} > {{`id` is also present, so I'm supplying docs with > just an `id` and a multivalued `name` field. 
The real setup is more complex, > but this is a minimal test case to illustrate the problem.}} > {{Starting with an empty index, I add the following doc to the index > successfully:}} > {code:java} > {"id": "1", "name": ["John Doe", "Jane Doe"]}{code} > {{And I can query it, seeing the expected result:}} > {code:java} > { > "responseHeader":{ > "status":0, > "QTime":23, > "params":{ > "q":"name:*" > } > }, > "response":{ > "numFound":1, > "start":0, > "numFoundExact":true, > "docs":[{ > "id":"1", > "name":["John Doe","Jane Doe"], > "_version_":1797873599884820480 > }] > } > }{code} > {{Next, I send an atomic update to modify the `name` field of that document > by removing `Jane Doe` and adding `Janet Doe`:}} > {code:java} > {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}}{code} > {{This atomic update that does both an `add` and a `remove` is something that > used to work for us under Solr 6.6, but we just noticed that it fails in our > production 9.5 instance and in 9.6, which I just downloaded to test against.}} > {{The error in the solr.log indicates that Solr mistakenly interprets the > `\{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then > throws an exception because our schema doesn't have the `{_}root{_}` field > that would be expected if we were using nested docs (which we don't use).}} > {{Here's the full stack trace from `solr.log` (version 9.6.0):}} > {noformat} > 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris > t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a 
uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > ~[?:?] > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) > ~[?:?] > at > org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1056) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:375) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312) > ~[?:?] > at > org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) > ~[?:?] > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:54) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:270) > ~[?:?] > at >
[jira] [Commented] (SOLR-16924) Restore: Have RESTORECORE set the UpdateLog state
[ https://issues.apache.org/jira/browse/SOLR-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844085#comment-17844085 ] David Smiley commented on SOLR-16924: - [~gerlowskija] I could use your insights on something here, please. I am looking back at this months later and got very confused for a while until I realized we have two RestoreCore classes, one in {{org.apache.solr.handler.admin.api}} and the other in {{{}org.apache.solr.handler{}}}. Wow; ok! The first implements the V2 API and calls the second, the V1 API. The change in this PR was placed at the end of the V2 API without consideration of the ambiguity. Uh oh! Thus it would seem the change will not take effect for the V1 API {*}but{*}, I see in SOLR-16490 that RestoreCoreOp (yet another layer below the CoreAdminHandler but above the real impl) calls the V2 RestoreCore. Wow again; I didn't expect that! I wonder then, is V1 RestoreCore invoked in any other code path? I see two – {{ReplicationHandler.restore()}} and {{{}InstallCoreData.installCoreData(){}}}. Are there any problems or bugs here? Like, _should_ the change in this PR be placed at the end of V1 instead? Isn't it wrong for CoreAdminHandler to be calling v2 stuff? On second thought, I could rationalize that as we transition the migration. Should the RestoreCore classes be merged? > Restore: Have RESTORECORE set the UpdateLog state > -- > > Key: SOLR-16924 > URL: https://issues.apache.org/jira/browse/SOLR-16924 > Project: Solr > Issue Type: Improvement >Reporter: David Smiley >Priority: Minor > Fix For: 9.5 > > Time Spent: 50m > Remaining Estimate: 0h > > This is a refactoring improvement designed to simplify & clarify a step in > collection restores. One of the final phases of RestoreCmd (collection > restore) is to call REQUESTAPPLYUPDATES on each newly restored replica in > order to transition the state of the UpdateLog to ACTIVE (not actually to > apply updates). 
The underlying call on the UpdateLog could instead be done > inside RESTORECORE at the end with explanatory comments as to the intent. I > think it makes more sense for RESTORECORE to finish with its UpdateLog ready. > And it's strange/curious to see requests in the cluster to apply updates > from an updateLog when there is none to do! Adding clarifying comments is > important. > See my comment: > https://issues.apache.org/jira/browse/SOLR-12065?focusedCommentId=17751792=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17751792 > I think there isn't any back-compat concern. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17274) atomic update error when using json w/ multiple modifiers
[ https://issues.apache.org/jira/browse/SOLR-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843522#comment-17843522 ] David Smiley commented on SOLR-17274: - The JSON syntax is not as rich as XML or "javabin" (native SolrJ), which are typed. *Perhaps* a solution is to assume certain operation names (e.g. "set", "add", "remove", etc.)? Ugh; it's kind of inelegant though. I wish we had used underscores to make them more special. At least this problem is only specific to JSON; you have a couple other options. > atomic update error when using json w/ multiple modifiers > - > > Key: SOLR-17274 > URL: https://issues.apache.org/jira/browse/SOLR-17274 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update >Affects Versions: 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6 > Environment: Observed first in 9.5.0 w/ OpenJDK 11, then reproduced > this simple test case in 9.6.0. >Reporter: Calvin Smith >Priority: Major > > I ran into a problem doing a json atomic update that tries to both `add` and > `remove` a value for a multivalued field in a single update. I saw it > initially in an instance that runs 9.5.0, and reproduced a minimal example > using Solr 9.6.0. > {{The only fields defined in the schema are:}} > {code:java} > required="true" stored="true"/> > > {code} > {{`id` is also present, so I'm supplying docs with > just an `id` and a multivalued `name` field. 
The real setup is more complex, > but this is a minimal test case to illustrate the problem.}} > {{Starting with an empty index, I add the following doc to the index > successfully:}} > {code:java} > {"id": "1", "name": ["John Doe", "Jane Doe"]}{code} > {{And I can query it, seeing the expected result:}} > {code:java} > { > "responseHeader":{ > "status":0, > "QTime":23, > "params":{ > "q":"name:*" > } > }, > "response":{ > "numFound":1, > "start":0, > "numFoundExact":true, > "docs":[{ > "id":"1", > "name":["John Doe","Jane Doe"], > "_version_":1797873599884820480 > }] > } > }{code} > {{Next, I send an atomic update to modify the `name` field of that document > by removing `Jane Doe` and adding `Janet Doe`:}} > {code:java} > {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}}{code} > {{This atomic update that does both an `add` and a `remove` is something that > used to work for us under Solr 6.6, but we just noticed that it fails in our > production 9.5 instance and in 9.6, which I just downloaded to test against.}} > {{The error in the solr.log indicates that Solr mistakenly interprets the > `\{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then > throws an exception because our schema doesn't have the `{_}root{_}` field > that would be expected if we were using nested docs (which we don't use).}} > {{Here's the full stack trace from `solr.log` (version 9.6.0):}} > {noformat} > 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris > t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a 
uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > ~[?:?] > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) > ~[?:?] > at > org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1056) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:375) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312) > ~[?:?] > at > org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) > ~[?:?] > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:54) > ~[?:?] > at >
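The early-return idea (via something like `IndexSchema.isUsableForChildDocs()`) and the known-operation-name heuristic floated in the comments above could be combined roughly as follows. The method names and the exact operation set here are assumptions for illustration, not Solr's current code.

```java
import java.util.Map;
import java.util.Set;

public class AtomicUpdateHeuristic {
    // Assumed set of atomic-update modifier names; Solr's actual list may differ.
    static final Set<String> OPS =
        Set.of("set", "add", "add-distinct", "remove", "removeregex", "inc");

    // Decide whether a nested JSON object is an atomic-update modifier map
    // rather than a child document.
    static boolean looksLikeAtomicOps(Map<String, ?> nested,
                                      boolean schemaUsableForChildDocs) {
        if (!schemaUsableForChildDocs) {
            // The schema has no _root_ field, so it can't be a child doc:
            // return early, as the first comment suggests.
            return true;
        }
        // Otherwise fall back to the op-name heuristic from the second comment.
        return OPS.containsAll(nested.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> mods = Map.of("add", "Janet Doe", "remove", "Jane Doe");
        System.out.println(looksLikeAtomicOps(mods, true));
        System.out.println(looksLikeAtomicOps(Map.of("name", "child"), true));
    }
}
```

With either check in place, the reporter's `{"add": ..., "remove": ...}` update would be routed to atomic-update handling instead of the nested-document error path.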
[jira] [Updated] (SOLR-17274) atomic update error when using json w/ multiple modifiers
[ https://issues.apache.org/jira/browse/SOLR-17274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17274: Component/s: update (was: UpdateRequestProcessors) > atomic update error when using json w/ multiple modifiers > - > > Key: SOLR-17274 > URL: https://issues.apache.org/jira/browse/SOLR-17274 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: update >Affects Versions: 9.0, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6 > Environment: Observed first in 9.5.0 w/ OpenJDK 11, then reproduced > this simple test case in 9.6.0. >Reporter: Calvin Smith >Priority: Major > > I ran into a problem doing a json atomic update that tries to both `add` and > `remove` a value for a multivalued field in a single update. I saw it > initially in an instance that runs 9.5.0, and reproduced a minimal example > using Solr 9.6.0. > {{The only fields defined in the schema are:}} > {code:java} > required="true" stored="true"/> > > {code} > {{`id` is also present, so I'm supplying docs with > just an `id` and a multivalued `name` field. 
The real setup is more complex, > but this is a minimal test case to illustrate the problem.}} > {{Starting with an empty index, I add the following doc to the index > successfully:}} > {code:java} > {"id": "1", "name": ["John Doe", "Jane Doe"]}{code} > {{And I can query it, seeing the expected result:}} > {code:java} > { > "responseHeader":{ > "status":0, > "QTime":23, > "params":{ > "q":"name:*" > } > }, > "response":{ > "numFound":1, > "start":0, > "numFoundExact":true, > "docs":[{ > "id":"1", > "name":["John Doe","Jane Doe"], > "_version_":1797873599884820480 > }] > } > }{code} > {{Next, I send an atomic update to modify the `name` field of that document > by removing `Jane Doe` and adding `Janet Doe`:}} > {code:java} > {"id": "1", "name": {"add": "Janet Doe", "remove": "Jane Doe"}}{code} > {{This atomic update that does both an `add` and a `remove` is something that > used to work for us under Solr 6.6, but we just noticed that it fails in our > production 9.5 instance and in 9.6, which I just downloaded to test against.}} > {{The error in the solr.log indicates that Solr mistakenly interprets the > `\{"add": "Janet Doe", "remove": "Jane Doe"}` as a nested document and then > throws an exception because our schema doesn't have the `{_}root{_}` field > that would be expected if we were using nested docs (which we don't use).}} > {{Here's the full stack trace from `solr.log` (version 9.6.0):}} > {noformat} > 2024-05-01 17:49:02.479 ERROR (qtp2059461664-30-0.0.0.0-3) [c: s: r: x:atris > t:0.0.0.0-3] o.a.s.h.RequestHandlerBase Client exception => > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > org.apache.solr.common.SolrException: Unable to index docs with children: the > schema must include definitions for both a 
uniqueKey field and the '_root_' > field, using the exact same fieldType > at > org.apache.solr.update.DocumentBuilder.unexpectedNestedDocException(DocumentBuilder.java:369) > ~[?:?] > at > org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:153) > ~[?:?] > at > org.apache.solr.update.AddUpdateCommand.makeLuceneDocs(AddUpdateCommand.java:213) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.updateDocOrDocValues(DirectUpdateHandler2.java:1056) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.doNormalUpdate(DirectUpdateHandler2.java:421) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:375) > ~[?:?] > at > org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:312) > ~[?:?] > at > org.apache.solr.update.processor.RunUpdateProcessorFactory$RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:76) > ~[?:?] > at > org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:54) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:270) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.doVersionAdd(DistributedUpdateProcessor.java:533) > ~[?:?] > at > org.apache.solr.update.processor.DistributedUpdateProcessor.lambda$versionAdd$0(DistributedUpdateProcessor.java:358) >
[jira] [Updated] (SOLR-17270) Umbrella: Per Replica State as default
[ https://issues.apache.org/jira/browse/SOLR-17270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17270: Component/s: SolrCloud > Umbrella: Per Replica State as default > -- > > Key: SOLR-17270 > URL: https://issues.apache.org/jira/browse/SOLR-17270 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Reporter: Justin Sweeney >Priority: Major > Labels: Umbrella > > The concept of Per Replica State (PRS) as a part of state management in Solr > has existed since 8.8.0. While PRS is in use in some places, there is still > work to be done to get it to a place where we can move to PRS as the default > behavior in Solr. > > This umbrella ticket will encapsulate the work required to get to this point > and enable PRS as the default behavior for Solr 10. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17088) TestPrepRecovery.testLeaderNotResponding fails much more lately
[ https://issues.apache.org/jira/browse/SOLR-17088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842821#comment-17842821 ] David Smiley commented on SOLR-17088: - Wow; you deserve an award of some kind for root causing that one! bq. does not use the default cloud solr.xml, which has options for setting the distributedClusterStateUpdate vs using the overseer This part is sad; I'm reminded of a wish-list item that I have -- I just filed SOLR-17267 to capture it. With that, the default and basically all test solr.xml files need not even mention distributedClusterStateUpdate nor many other things. > TestPrepRecovery.testLeaderNotResponding fails much more lately > --- > > Key: SOLR-17088 > URL: https://issues.apache.org/jira/browse/SOLR-17088 > Project: Solr > Issue Type: Test >Reporter: David Smiley >Assignee: Houston Putman >Priority: Minor > Attachments: 2023-11-27 fail.log.txt > > Time Spent: 10m > Remaining Estimate: 0h > > I'll attach logs. I didn't try and root cause. [Increased in test frequency > lately|http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.TestPrepRecovery.testLeaderNotResponding]. > All recent failures happen on main, not 9x. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17160) Bulk admin operations may fail because of max tracked requests
[ https://issues.apache.org/jira/browse/SOLR-17160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842732#comment-17842732 ] David Smiley commented on SOLR-17160: - BTW I think Denial of Service should be a non-concern for admin APIs. > Bulk admin operations may fail because of max tracked requests > -- > > Key: SOLR-17160 > URL: https://issues.apache.org/jira/browse/SOLR-17160 > Project: Solr > Issue Type: Bug > Components: Backup/Restore >Affects Versions: 8.11, 9.5 >Reporter: Pierre Salagnac >Priority: Minor > Time Spent: 1h 20m > Remaining Estimate: 0h > > In {{{}CoreAdminHandler{}}}, we maintain in-memory the list of in-flight > requests and completed/failed request. > _Note they are core/replica level async requests, and not top level requests > which mostly at the collection level. Top level requests are tracked by > storing the async ID in a Zookeeper node, which is not related to this > ticket._ > > For completed/failed requests, we only track a maximum of 100 requests by > dropping the oldest ones. The typical client in > {{CollectionHandlingUtils.waitForCoreAdminAsyncCallToComplete()}} polls > status of the submitted requests, with a retry loop until requests are > completed. If for some reason we have more than 100 requests that complete or > fail on a node before all statuses are polled by the client, the statuses are > lost and the client will fail with an unexpected error similar to: > {{Invalid status request for requestId: '{_}{_}' - 'notfound'. Retried > __ times}} > > Instead of having a hard limit for the number of requests we track, we could > have time based eviction. I think it makes sense to keep status of a request > until a given timeout, and then drop it ignoring how many requests we > currently track. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
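The time-based eviction proposed in the issue above, replacing the hard cap of 100 completed requests, might be sketched like this. The class and method names are illustrative, not CoreAdminHandler's actual structure; timestamps are passed in explicitly to keep the sketch deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TimedStatusTracker {
    // Sketch: keep every completed/failed request status until it ages past
    // a retention window, with no count limit, so a burst of >100 core-level
    // async requests can't drop statuses before the client polls them.
    private final Map<String, Long> completedAt = new ConcurrentHashMap<>();
    private final long retentionMs;

    TimedStatusTracker(long retentionMs) { this.retentionMs = retentionMs; }

    void markCompleted(String requestId, long nowMs) {
        completedAt.put(requestId, nowMs);
    }

    /** Drops statuses older than the retention window. */
    void evictExpired(long nowMs) {
        completedAt.values().removeIf(t -> nowMs - t > retentionMs);
    }

    boolean isTracked(String requestId) {
        return completedAt.containsKey(requestId);
    }

    public static void main(String[] args) {
        TimedStatusTracker tracker = new TimedStatusTracker(10_000); // 10s window
        tracker.markCompleted("req-1", 0);
        tracker.markCompleted("req-2", 9_000);
        tracker.evictExpired(11_000); // req-1 is 11s old and gets evicted
        System.out.println(tracker.isTracked("req-1"));
        System.out.println(tracker.isTracked("req-2"));
    }
}
```

A real implementation would also need a periodic or piggybacked eviction trigger rather than an explicit call.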
[jira] [Commented] (SOLR-16577) Core load issues are not always logged
[ https://issues.apache.org/jira/browse/SOLR-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842156#comment-17842156 ] David Smiley commented on SOLR-16577: - What a surprise that is; sorry to hear that! I think upping the timeout (e.g. to infinity) for this scenario makes the most sense to me. I still don't believe in the needless use of a Future here. > Core load issues are not always logged > -- > > Key: SOLR-16577 > URL: https://issues.apache.org/jira/browse/SOLR-16577 > Project: Solr > Issue Type: Improvement >Reporter: Haythem Khiri >Assignee: David Smiley >Priority: Minor > Fix For: 9.5 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > It's possible for a core load failure to not have its cause logged. At least > the failure is tracked in a metric and one can do an admin request to fetch > the cause but really it ought to be logged so one can more easily see what > the problem is. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17058) Request param to disable distributed IDF request at query time
[ https://issues.apache.org/jira/browse/SOLR-17058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17058: Summary: Request param to disable distributed IDF request at query time (was: Request param to disable distributed stats request at query time) > Request param to disable distributed IDF request at query time > -- > > Key: SOLR-17058 > URL: https://issues.apache.org/jira/browse/SOLR-17058 > Project: Solr > Issue Type: New Feature > Components: query >Reporter: wei wang >Assignee: Mikhail Khludnev >Priority: Minor > Fix For: 9.6 > > Time Spent: 4h 20m > Remaining Estimate: 0h > > When distributed IDF is enabled in solr cloud by adding one of the cache > implementations in solrconfig.xml > [https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#distributedidf], > each solr query will incur a distributed shard request to get term > statistics > "debug": { > "track": { > "rid": "-54", > "PARSE_QUERY": { > "http://192.168.0.34:8987/solr/shard2_replica_n1/": > { "QTime": “2”, > > "ElapsedTime": "13", > > "RequestPurpose": "GET_TERM_STATS", > … > > For queries that does not use distributed IDF information for scoring > such as terms filter by id, the stats request is not necessary. Hence I > propose to add a {{distrib.statsCache}} request param so that the distributed > stats request can be disabled at query time. > # {{distrib.statsCache}} defaults to {{{}true{}}}. When the param is not > present, there is no change to current distributed IDF behavior. > # When explicitly set {{{}distrib.statsCache{}}}{{{}=false{}}}, distributed > stats call is disabled for the current query. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
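The proposed param semantics above (absent means true, explicit false disables the distributed stats call) amount to a one-line default check. Only the `distrib.statsCache` name comes from the proposal; the class and method below are illustrative:

```java
import java.util.Map;

/**
 * Sketch of the proposed semantics: the distributed term-stats request runs
 * unless the query explicitly sets distrib.statsCache=false. The surrounding
 * helper is hypothetical, not Solr source.
 */
class StatsCacheParam {
    static final String DISTRIB_STATS_CACHE = "distrib.statsCache";

    static boolean useDistributedStats(Map<String, String> params) {
        // Absent param -> true, preserving current distributed-IDF behavior.
        return Boolean.parseBoolean(params.getOrDefault(DISTRIB_STATS_CACHE, "true"));
    }
}
```

A client issuing a terms filter by id would simply add `distrib.statsCache=false` to the request and skip the GET_TERM_STATS round trip.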
[jira] [Updated] (SOLR-14675) CloudSolrClient requestAsync API
[ https://issues.apache.org/jira/browse/SOLR-14675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-14675: Summary: CloudSolrClient requestAsync API (was: [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient) > CloudSolrClient requestAsync API > > > Key: SOLR-14675 > URL: https://issues.apache.org/jira/browse/SOLR-14675 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: Rishi Sankar >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > In Solr 8.7, org.apache.solr.client.solrj.impl.Http2SolrClient has an > asyncRequest method which supports making async requests with a callback > parameter. However, this method is only used internally by the > MockingHttp2SolrClient and LBHttp2SolrClient. I'd like to contribute a method > to the CloudHttp2SolrClient that allows for making an asynchronous request > with a callback parameter that can be passed down to the Http2SolrClient > asyncRequest call. > I've been coordinating with [~dsmiley] about making this change. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-14675) [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-14675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-14675: Affects Version/s: (was: 8.7) > [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient > -- > > Key: SOLR-14675 > URL: https://issues.apache.org/jira/browse/SOLR-14675 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: Rishi Sankar >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > In Solr 8.7, org.apache.solr.client.solrj.impl.Http2SolrClient has an > asyncRequest method which supports making async requests with a callback > parameter. However, this method is only used internally by the > MockingHttp2SolrClient and LBHttp2SolrClient. I'd like to contribute a method > to the CloudHttp2SolrClient that allows for making an asynchronous request > with a callback parameter that can be passed down to the Http2SolrClient > asyncRequest call. > I've been coordinating with [~dsmiley] about making this change. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-14675) [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-14675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840998#comment-17840998 ] David Smiley commented on SOLR-14675: - Now that SOLR-14763 (async API refactoring of Http2SolrClient) is done, this issue should be unblocked for having the same API (requestAsync() returning a CompletableFuture). Not sure if the initial PR here, long closed has any bits of interest; probably not as it was superseded by https://github.com/apache/lucene-solr/pull/1770 which *does* have the CloudHttp2SolrClient part. That part did not survive into the API refactor in order to keep the scope manageable. > [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient > -- > > Key: SOLR-14675 > URL: https://issues.apache.org/jira/browse/SOLR-14675 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Affects Versions: 8.7 >Reporter: Rishi Sankar >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > In Solr 8.7, org.apache.solr.client.solrj.impl.Http2SolrClient has an > asyncRequest method which supports making async requests with a callback > parameter. However, this method is only used internally by the > MockingHttp2SolrClient and LBHttp2SolrClient. I'd like to contribute a method > to the CloudHttp2SolrClient that allows for making an asynchronous request > with a callback parameter that can be passed down to the Http2SolrClient > asyncRequest call. > I've been coordinating with [~dsmiley] about making this change. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-14675) [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient
[ https://issues.apache.org/jira/browse/SOLR-14675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-14675: Priority: Major (was: Minor) > [SolrJ] Http2SolrClient async request through CloudHttp2SolrClient > -- > > Key: SOLR-14675 > URL: https://issues.apache.org/jira/browse/SOLR-14675 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Affects Versions: 8.7 >Reporter: Rishi Sankar >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > In Solr 8.7, org.apache.solr.client.solrj.impl.Http2SolrClient has an > asyncRequest method which supports making async requests with a callback > parameter. However, this method is only used internally by the > MockingHttp2SolrClient and LBHttp2SolrClient. I'd like to contribute a method > to the CloudHttp2SolrClient that allows for making an asynchronous request > with a callback parameter that can be passed down to the Http2SolrClient > asyncRequest call. > I've been coordinating with [~dsmiley] about making this change. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-14763) SolrJ Client Async HTTP/2 Requests
[ https://issues.apache.org/jira/browse/SOLR-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-14763: Description: In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make more thread efficient HttpShardHandler requests. This added public async request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways this API can be improved, that I will track in this issue: 1) Using a CompletableFuture-based async API signature, instead of using internal custom interfaces (Cancellable, AsyncListener) - based on [this discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E]. --- The below was removed from the scope of what was delivered in 9.6; a linked JIRA will address it. 2) An async API is also useful in other HTTP/2 Solr clients as well, particularly CloudHttp2SolrClient (SOLR-14675). I will add a requestAsync method to the SolrClient class, with a default method that initially throws an unsupported operation exception (maybe this can be later updated to use an executor to handle the async request as a default impl). For now, I'll override the default implementation in the Http2SolrClient and CloudHttp2SolrClient. was: In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make more thread efficient HttpShardHandler requests. This added public async request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways this API can be improved, that I will track in this issue: 1) Using a CompletableFuture-based async API signature, instead of using internal custom interfaces (Cancellable, AsyncListener) - based on [this discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E]. 2) An async API is also useful in other HTTP/2 Solr clients as well, particularly CloudHttp2SolrClient (SOLR-14675). 
I will add a requestAsync method to the SolrClient class, with a default method that initially throws an unsupported operation exception (maybe this can be later updated to use an executor to handle the async request as a default impl). For now, I'll override the default implementation in the Http2SolrClient and CloudHttp2SolrClient. > SolrJ Client Async HTTP/2 Requests > -- > > Key: SOLR-14763 > URL: https://issues.apache.org/jira/browse/SOLR-14763 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Affects Versions: 8.7 >Reporter: Rishi Sankar >Assignee: James Dyer >Priority: Major > Fix For: main (10.0), 9.6 > > Time Spent: 7h 40m > Remaining Estimate: 0h > > In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make > more thread efficient HttpShardHandler requests. This added public async > request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways > this API can be improved, that I will track in this issue: > 1) Using a CompletableFuture-based async API signature, instead of using > internal custom interfaces (Cancellable, AsyncListener) - based on [this > discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E]. > --- > The below was removed from the scope of what was delivered in 9.6; a linked > JIRA will address it. > 2) An async API is also useful in other HTTP/2 Solr clients as well, > particularly CloudHttp2SolrClient (SOLR-14675). I will add a requestAsync > method to the SolrClient class, with a default method that initially throws > an unsupported operation exception (maybe this can be later updated to use an > executor to handle the async request as a default impl). For now, I'll > override the default implementation in the Http2SolrClient and > CloudHttp2SolrClient. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
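The CompletableFuture-based signature discussed in SOLR-14763 can be bridged from a callback-style API with a small adapter. `Listener` below stands in for SolrJ's internal AsyncListener and is not the real type; this is a sketch of the refactor direction, not the shipped code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Illustrative stand-in for a callback interface like AsyncListener. */
interface Listener<T> {
    void onSuccess(T result);
    void onFailure(Throwable t);
}

/**
 * Hypothetical shim: wrap a callback-style async API in a
 * CompletableFuture-returning method, the shape requestAsync() takes.
 */
class AsyncAdapter {
    /** Bridges callbackApi.accept(listener) into a CompletableFuture. */
    static <T> CompletableFuture<T> toFuture(Consumer<Listener<T>> callbackApi) {
        CompletableFuture<T> future = new CompletableFuture<>();
        callbackApi.accept(new Listener<T>() {
            @Override public void onSuccess(T result) { future.complete(result); }
            @Override public void onFailure(Throwable t) { future.completeExceptionally(t); }
        });
        return future;
    }
}
```

Callers then compose results with `thenApply`/`exceptionally` instead of implementing bespoke listener interfaces, which is the main appeal of the refactor.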
[jira] [Resolved] (SOLR-17243) CloudSolrClient should support SolrRequest.getBasePath (a URL)
[ https://issues.apache.org/jira/browse/SOLR-17243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17243. - Resolution: Won't Fix > CloudSolrClient should support SolrRequest.getBasePath (a URL) > -- > > Key: SOLR-17243 > URL: https://issues.apache.org/jira/browse/SOLR-17243 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Reporter: David Smiley >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Sometimes you have a CloudSolrClient and you have a use-case where you want > to send a request to a specific node / URL. Perhaps uncommon for typical > users but it's common internally (within Solr) to encounter this. > SolrRequest.setBasePath / getBasePath (which is actually a URL!) is already > here for this but it only works on an Http SolrClient. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17255) ClientUtils.encodeLocalParamVal doesn't work with param refs, breaks SolrParams.toLocalParamsString
[ https://issues.apache.org/jira/browse/SOLR-17255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840578#comment-17840578 ] David Smiley commented on SOLR-17255: - Ouch; good catch! I recall writing that a long time ago. > ClientUtils.encodeLocalParamVal doesn't work with param refs, breaks > SolrParams.toLocalParamsString > --- > > Key: SOLR-17255 > URL: https://issues.apache.org/jira/browse/SOLR-17255 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Reporter: Chris M. Hostetter >Priority: Major > > If you try to use {{SolrParams.toLocalParamsString}} where some of your param > values are {{$other_param}} style param references, those refs will wind up > wrapped in single quotes, preventing the param de-referencing from working. > Example... > {code:java} > final ModifiableSolrParams params = new ModifiableSolrParams(); > params.set("type", "edismax"); > params.set("v","$other_param"); > System.out.println(params.toLocalParamString()) > // Output: {! type=edismax v='$other_param'} > {code} > Ironically: {{ClientUtils.encodeLocalParamVal}} actually has a check to see > if the string starts with {{"$"}} which causes it to bypass a loop that > checks to see if the string needs to be quoted – but bypassing that loop > doesn't leave the method variables ({{{}i{}}} and {{{}len{}}}) in a state > that allow the subsequent short-circut check (which returns the original > value) to kick in – so the value is always falls through to the {{// We need > to enclose in quotes... but now we need to escape}} logic > (It looks like this bug has always existed in every version of these methods) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
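The intended behavior the report describes, with `$param` references passing through unquoted, can be sketched as a standalone encoder. This is an illustrative re-implementation of the contract, not the ClientUtils source, and the bare-value character set is an assumption:

```java
/**
 * Sketch of the intended encodeLocalParamVal contract per the bug report:
 * a $param reference must pass through unquoted so dereferencing works;
 * simple values need no quotes; anything else is single-quoted with
 * escaping. Hypothetical re-implementation, not SolrJ code.
 */
class LocalParamEncoder {
    static String encode(String val) {
        if (val.startsWith("$")) {
            return val; // param dereference: never quote
        }
        // Quote only when the value contains characters a bare
        // local-params value can't carry (assumed character set).
        boolean needsQuotes = val.isEmpty() || !val.chars().allMatch(
            c -> Character.isLetterOrDigit(c) || c == '.' || c == '_' || c == '-');
        if (!needsQuotes) {
            return val;
        }
        return "'" + val.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }
}
```

Under this contract the example in the report would render as `{!type=edismax v=$other_param}`, letting the de-referencing kick in.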
[jira] [Comment Edited] (SOLR-12813) SolrCloud + 2 shards + subquery + auth = 401 Exception
[ https://issues.apache.org/jira/browse/SOLR-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840566#comment-17840566 ] David Smiley edited comment on SOLR-12813 at 4/24/24 8:23 PM: -- When using basic auth, is this a reference to BasicAuthPlugin, subclass of AuthenticationPlugin? AFAIK, AuthenticationPlugin (of whatever type) is instrumented transparently within Solr so that Solr code usually just-works correctly. was (Author: dsmiley): When using basic auth, is this a reference to BasicAuthPlugin, subclass of AuthenticationPlugin? Secondly, does anyone know why only PKIAuthenticationPlugin instruments clients AFAIK, AuthenticationPlugin (of whatever type) is instrumented transparently within Solr so that Solr code usually just-works correctly. > SolrCloud + 2 shards + subquery + auth = 401 Exception > -- > > Key: SOLR-12813 > URL: https://issues.apache.org/jira/browse/SOLR-12813 > Project: Solr > Issue Type: Bug > Components: security, SolrCloud >Affects Versions: 6.4.1, 7.5, 8.11 >Reporter: Igor Fedoryn >Priority: Major > Attachments: screen1.png, screen2.png > > Time Spent: 40m > Remaining Estimate: 0h > > Environment: * Solr 6.4.1 > * Zookeeper 3.4.6 > * Java 1.8 > Run Zookeeper > Upload simple configuration wherein the Solr schema has fields for a > relationship between parent/child > Run two Solr instance (2 nodes) > Create the collection with 1 shard on each Solr nodes > > Add parent document to one shard and child document to another shard. > The response for: * > /select?q=ChildIdField:VALUE=*,parents:[subqery]=\{!term f=id > v=$row.ParentIdsField} > correct. > > After that add Basic Authentication with some user for collection. > Restart Solr or reload Solr collection. > If the simple request /select?q=*:* with authorization on Solr server is a > success then run previously request > with authorization on Solr server and you get the exception: "Solr HTTP > error: Unauthorized (401) " > > Screens in the attachment. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-12813) SolrCloud + 2 shards + subquery + auth = 401 Exception
[ https://issues.apache.org/jira/browse/SOLR-12813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840566#comment-17840566 ] David Smiley commented on SOLR-12813: - When using basic auth, is this a reference to BasicAuthPlugin, subclass of AuthenticationPlugin? Secondly, does anyone know why only PKIAuthenticationPlugin instruments clients AFAIK, AuthenticationPlugin (of whatever type) is instrumented transparently within Solr so that Solr code usually just-works correctly. > SolrCloud + 2 shards + subquery + auth = 401 Exception > -- > > Key: SOLR-12813 > URL: https://issues.apache.org/jira/browse/SOLR-12813 > Project: Solr > Issue Type: Bug > Components: security, SolrCloud >Affects Versions: 6.4.1, 7.5, 8.11 >Reporter: Igor Fedoryn >Priority: Major > Attachments: screen1.png, screen2.png > > Time Spent: 40m > Remaining Estimate: 0h > > Environment: * Solr 6.4.1 > * Zookeeper 3.4.6 > * Java 1.8 > Run Zookeeper > Upload simple configuration wherein the Solr schema has fields for a > relationship between parent/child > Run two Solr instance (2 nodes) > Create the collection with 1 shard on each Solr nodes > > Add parent document to one shard and child document to another shard. > The response for: * > /select?q=ChildIdField:VALUE=*,parents:[subqery]=\{!term f=id > v=$row.ParentIdsField} > correct. > > After that add Basic Authentication with some user for collection. > Restart Solr or reload Solr collection. > If the simple request /select?q=*:* with authorization on Solr server is a > success then run previously request > with authorization on Solr server and you get the exception: "Solr HTTP > error: Unauthorized (401) " > > Screens in the attachment. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Created] (SOLR-17243) CloudSolrClient should support SolrRequest.getBasePath (a URL)
David Smiley created SOLR-17243: --- Summary: CloudSolrClient should support SolrRequest.getBasePath (a URL) Key: SOLR-17243 URL: https://issues.apache.org/jira/browse/SOLR-17243 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: SolrJ Reporter: David Smiley Sometimes you have a CloudSolrClient and you have a use-case where you want to send a request to a specific node / URL. Perhaps uncommon for typical users but it's common internally (within Solr) to encounter this. SolrRequest.setBasePath / getBasePath (which is actually a URL!) is already here for this but it only works on an Http SolrClient. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17106) LBSolrClient: Make it configurable to remove zombie ping checks
[ https://issues.apache.org/jira/browse/SOLR-17106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837877#comment-17837877 ] David Smiley commented on SOLR-17106: - Just want to mention that we should probably have these settings choose default values via EnvUtils (a new thing) so we can conveniently make these settings adjustments via system properties or env vars, whichever is convenient to the user. > LBSolrClient: Make it configurable to remove zombie ping checks > --- > > Key: SOLR-17106 > URL: https://issues.apache.org/jira/browse/SOLR-17106 > Project: Solr > Issue Type: Improvement >Reporter: Aparna Suresh >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > Following discussion from a dev list discussion here: > [https://lists.apache.org/thread/f0zfmpg0t48xrtppyfsmfc5ltzsq2qqh] > The issue involves scalability challenges in SolrJ's *LBSolrClient* when a > pod with numerous cores experiences connectivity problems. The "zombie" > tracking mechanism, operating on a core basis, becomes a bottleneck during > distributed search on a massive multi shard collection. Threads attempting to > reach unhealthy cores contribute to a high computational load, causing > performance issues. > As suggested by Chris Hostetter: LBSolrClient could be configured to disable > zombie "ping" checks, but retain zombie tracking. Once a replica/endpoint is > identified as a zombie, it could be held in zombie jail for X seconds, before > being released - hoping that by this timeframe ZK would be updated to mark > this endpoint DOWN or the pod is back up and CloudSolrClient would avoid > querying it. In any event, only 1 failed query would be needed to send the > server back to zombie jail. 
> > There are benefits to making this change: > * Eliminate the zombie ping requests, which would otherwise overload pod(s) > coming up after a restart > * Avoid memory leaks in case a node/replica goes away permanently but > stays a zombie forever, with a background thread in LBSolrClient constantly > pinging it -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
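The jail-without-pings idea suggested above can be sketched with a timestamp map and no background thread. Names here are hypothetical, not LBSolrClient internals:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Sketch: a failed endpoint sits in "zombie jail" for a fixed window and
 * is released by time alone -- no ping thread. Illustrative only.
 */
class ZombieJail {
    private final Map<String, Long> jailedUntilMs = new ConcurrentHashMap<>();
    private final long sentenceMs;

    ZombieJail(long sentenceMs) {
        this.sentenceMs = sentenceMs;
    }

    /** Called when a request to the endpoint fails. */
    void jail(String endpoint) {
        jailedUntilMs.put(endpoint, System.currentTimeMillis() + sentenceMs);
    }

    /**
     * True if the endpoint should be skipped. Expired entries are dropped,
     * so a released zombie gets exactly one probe query, and a single
     * failure sends it straight back to jail.
     */
    boolean isJailed(String endpoint) {
        Long until = jailedUntilMs.get(endpoint);
        if (until == null) {
            return false;
        }
        if (System.currentTimeMillis() >= until) {
            jailedUntilMs.remove(endpoint, until);
            return false;
        }
        return true;
    }
}
```

Dropping entries on release also addresses the memory-leak concern: an endpoint that never comes back simply ages out instead of being pinged forever.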
[jira] [Resolved] (SOLR-17204) REPLACENODE command does not support source not being live
[ https://issues.apache.org/jira/browse/SOLR-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17204. - Fix Version/s: 9.6.0 Resolution: Fixed I reclassified this as an improvement because I think a user will mostly perceive it this way. Thanks for contributing Vincent! > REPLACENODE command does not support source not being live > -- > > Key: SOLR-17204 > URL: https://issues.apache.org/jira/browse/SOLR-17204 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Vincent Primault >Priority: Minor > Fix For: 9.6.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > The REPLACENODE command explicitly does not support the source node not being > live, and fails at the beginning if it is not live. However, later on, it > also explicitly supports that same source node not being live. > In most situations, except if a shard has a single replica being on the down > source node, the unavailability of the source node should not be problematic. > I therefore propose to support running the REPLACENODE command with the > source node not being live. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17204) REPLACENODE command does not support source not being live
[ https://issues.apache.org/jira/browse/SOLR-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17204: Issue Type: Improvement (was: Bug) > REPLACENODE command does not support source not being live > -- > > Key: SOLR-17204 > URL: https://issues.apache.org/jira/browse/SOLR-17204 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Vincent Primault >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > The REPLACENODE command explicitly does not support the source node not being > live, and fails at the beginning if it is not live. However, later on, it > also explicitly supports that same source node not being live. > In most situations, except if a shard has a single replica being on the down > source node, the unavailability of the source node should not be problematic. > I therefore propose to support running the REPLACENODE command with the > source node not being live. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-10654) Expose Metrics in Prometheus format DIRECTLY from Solr
[ https://issues.apache.org/jira/browse/SOLR-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836368#comment-17836368 ] David Smiley commented on SOLR-10654: - Cool to see [your PR|https://github.com/apache/solr/pull/2375] Mathew :-) You didn't title that PR starting with this JIRA issue, so it didn't auto-link. Also it's good to comment in the JIRA issue about a PR any way since there is no notification whatsoever to those watching this JIRA issue even when the link is auto-added. Disclaimer: I haven't been looking closely at metrics lately. A parallel registry seems painful -- overhead and the synchronization risks. Moving to Micrometer -- do you think it would affect most metrics publishers in Solr (thus touch tons of source files) or only the metrics internals/plumbing? Either way, probably for Solr 10 if we go this way. Maybe there could be a hard-coded algorithmic approach that can convert the raw name to a tagged/labelled one metric? > Expose Metrics in Prometheus format DIRECTLY from Solr > -- > > Key: SOLR-10654 > URL: https://issues.apache.org/jira/browse/SOLR-10654 > Project: Solr > Issue Type: Improvement > Components: metrics >Reporter: Keith Laban >Priority: Major > Attachments: prometheus_metrics.txt > > Time Spent: 10m > Remaining Estimate: 0h > > Expose metrics via a `wt=prometheus` response type. > Example scape_config in prometheus.yml: > {code} > scrape_configs: > - job_name: 'solr' > metrics_path: '/solr/admin/metrics' > params: > wt: ["prometheus"] > static_configs: > - targets: ['localhost:8983'] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
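The "hard-coded algorithmic approach" floated at the end of that comment could look like a simple rewrite from a hierarchical metric name to a Prometheus name with labels. The mapping rules below are invented purely for illustration; the actual registry layout would drive the real rules:

```java
/**
 * Sketch of an algorithmic conversion from a dotted, hierarchical Solr
 * metric name to a Prometheus-style name with labels. The rule here
 * (prefix + metric suffix, core and handler as labels) is hypothetical.
 */
class PrometheusName {
    /**
     * e.g. core "techproducts", handler "/select", metric "requests"
     *   -> solr_core_requests{core="techproducts",handler="/select"}
     */
    static String convert(String core, String handler, String metric) {
        return "solr_core_" + metric.replace('.', '_')
            + "{core=\"" + core + "\",handler=\"" + handler + "\"}";
    }
}
```

The appeal is that no parallel registry is needed: the labelled name is derived on demand at scrape time from the raw dotted name.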
[jira] [Commented] (SOLR-8393) Component for Solr resource usage planning
[ https://issues.apache.org/jira/browse/SOLR-8393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836360#comment-17836360 ] David Smiley commented on SOLR-8393: What about Solr's metrics API; has that been explored as a solution to the problem/need? We can consider adding more metrics. I don't see why a query should return core metrics. > Component for Solr resource usage planning > -- > > Key: SOLR-8393 > URL: https://issues.apache.org/jira/browse/SOLR-8393 > Project: Solr > Issue Type: Improvement >Reporter: Steve Molloy >Priority: Major > Attachments: SOLR-8393-1.patch, SOLR-8393.patch, SOLR-8393.patch, > SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, > SOLR-8393.patch, SOLR-8393.patch, SOLR-8393.patch, SOLR-8393_tag_7.5.0.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > One question that keeps coming back is how much disk and RAM do I need to run > Solr. The most common response is that it highly depends on your data. While > true, it makes for frustrated users trying to plan their deployments. > The idea I'm bringing is to create a new component that will attempt to > extrapolate resources needed in the future by looking at resources currently > used. By adding a parameter for the target number of documents, current > resources are adapted by a ratio relative to current number of documents. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Resolved] (SOLR-12404) Start using HTTP/2 instead of HTTP/1.1.
[ https://issues.apache.org/jira/browse/SOLR-12404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-12404. - Resolution: Fixed Closing as completed by other work items. > Start using HTTP/2 instead of HTTP/1.1. > --- > > Key: SOLR-12404 > URL: https://issues.apache.org/jira/browse/SOLR-12404 > Project: Solr > Issue Type: Improvement >Reporter: Mark Miller >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Updated] (SOLR-17189) DockMakerTest.testRealisticUnicode fails from whitespace assumption
[ https://issues.apache.org/jira/browse/SOLR-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17189: Fix Version/s: 9.6.0 > DockMakerTest.testRealisticUnicode fails from whitespace assumption > --- > > Key: SOLR-17189 > URL: https://issues.apache.org/jira/browse/SOLR-17189 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: benchmarks >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Fix For: 9.6.0 > > Time Spent: 1h > Remaining Estimate: 0h > > DockMakerTest.testRealisticUnicode fails 1-2% of the time -- > [link|https://ge.apache.org/scans/tests?search.timeZoneId=America%2FNew_York=org.apache.solr.bench.DockMakerTest=testRealisticUnicode]. > {quote}java.lang.AssertionError: expected:<6> but was:<7> > at __randomizedtesting.SeedInfo.seed([C5136F274AFF3ADD:95FFEBF499446D74]:0) > ••• > at > org.apache.solr.bench.DockMakerTest.testRealisticUnicode(DockMakerTest.java:189){quote} > It seems clear it's because it assumes that the "realistic unicode" chars > won't match the regexp: {{\s}}. A single space char is used to join the > words but maybe this or other whitespace chars are in those unicode codepoint > blocks. > Additionally, it's frustrating that this particular benchmark framework > doesn't honor tests.seed in its generation of random data and thus it's hard > to reproduce the failure. That ought to be fixed as well. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
[jira] [Commented] (SOLR-17181) Performance degradation matching glob patterns for fields
[ https://issues.apache.org/jira/browse/SOLR-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835745#comment-17835745 ] David Smiley commented on SOLR-17181: - Next time, when squash-merging, please clean up/editorialize the default commit message. It always needs some love. Please set the proper "Fix Version" when resolving an issue. > Performance degradation matching glob patterns for fields > - > > Key: SOLR-17181 > URL: https://issues.apache.org/jira/browse/SOLR-17181 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: query, search >Affects Versions: 9.5.0 >Reporter: Justin Sweeney >Assignee: Justin Sweeney >Priority: Major > Labels: fl, performance > Time Spent: 2h 40m > Remaining Estimate: 0h > > This ticket: https://issues.apache.org/jira/browse/SOLR-17022 seems to have > caused some performance degradation when matching glob patterns to fields as > noted in this thread: > https://lists.apache.org/thread/vbwnjxprl6s1qy0t1jzfcw8hprg1gvzh.
[jira] [Updated] (SOLR-17211) HttpJdkSolrClient: Support Async
[ https://issues.apache.org/jira/browse/SOLR-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17211: Fix Version/s: 9.6.0 > HttpJdkSolrClient: Support Async > - > > Key: SOLR-17211 > URL: https://issues.apache.org/jira/browse/SOLR-17211 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Reporter: James Dyer >Assignee: James Dyer >Priority: Minor > Fix For: 9.6.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > With SOLR-599 we added a new SolrJ client *HttpJdkSolrClient* which uses > *java.net.http.HttpClient* internally. This JDK client supports asynchronous > requests, so the Jdk Solr Client likewise can have async support. This > ticket is to: > 1. Extract from *Http2SolrClient* method > {code:java} > public Cancellable asyncRequest( > SolrRequest<?> solrRequest, > String collection, > AsyncListener<NamedList<Object>> asyncListener) > {code} > 2. Implement on *HttpJdkSolrClient* > 3. Add javadoc for both clients. > 4. Add unit tests for both clients.
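For readers unfamiliar with the callback-style contract the ticket proposes to extract, here is a minimal self-contained sketch. The interface shapes follow the snippet quoted above, but the bodies are illustrative stand-ins, not SolrJ's actual implementation:

```java
// Minimal sketch of the callback-style async contract quoted above; the
// interface shapes mirror the ticket's snippet, the bodies are stand-ins.
public class AsyncContractSketch {

    /** Handle returned by asyncRequest so callers can abort an in-flight call. */
    interface Cancellable {
        void cancel();
    }

    /** Receives the outcome of an asynchronous request. */
    interface AsyncListener<T> {
        default void onStart() {}
        void onSuccess(T result);
        void onFailure(Throwable t);
    }

    /** Toy "client": completes inline so the callback flow is easy to follow. */
    static <T> Cancellable asyncRequest(T cannedResult, AsyncListener<T> listener) {
        listener.onStart();
        listener.onSuccess(cannedResult);
        return () -> { /* nothing in flight in this sketch */ };
    }

    /** Drives asyncRequest and returns what the listener observed. */
    static String roundTrip(String request) {
        StringBuilder seen = new StringBuilder();
        asyncRequest(request, new AsyncListener<String>() {
            @Override public void onSuccess(String r) { seen.append(r); }
            @Override public void onFailure(Throwable t) { seen.append("failure"); }
        });
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ok"));
    }
}
```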
[jira] [Updated] (SOLR-17206) Update requests to SolrCloud can return status value of -1
[ https://issues.apache.org/jira/browse/SOLR-17206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated SOLR-17206: Fix Version/s: 9.6.0 > Update requests to SolrCloud can return status value of -1 > -- > > Key: SOLR-17206 > URL: https://issues.apache.org/jira/browse/SOLR-17206 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 9.4 >Reporter: Paul McArthur >Priority: Minor > Fix For: 9.6.0 > > Time Spent: 20m > Remaining Estimate: 0h > > It is possible for SolrCloud to return an HTTP status code of -1 for an update > request that is distributed, if all of the distributed commands fail with an > exception that is not a SolrException (e.g. an IOException). > > SolrCmdDistributor.SolrError sets a default value of -1 for the status code > response for the distributed request. If a SolrException is encountered, the > status code is replaced with the code from the response. If any other > exception type is raised, the status code remains as -1. > DistributedUpdatesAsyncException analyzes the errors from all distributed > commands to determine the overall status code for the response. If all status > codes equal -1, then -1 is returned as the status. > > The code should correspond to a valid HTTP status code, in this case, 500.
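The fallback described in the issue can be sketched in a few lines. The class and method names below are hypothetical (this is not the SolrCmdDistributor or DistributedUpdatesAsyncException source), and the "prefer the highest real code" rule is an assumption for illustration; the fixed behavior is only that the -1 sentinel never escapes:

```java
import java.util.List;

// Sketch of the status derivation discussed above: per-shard errors default
// to -1 unless a SolrException supplied a real HTTP code; the overall
// response should fall back to 500 rather than leak the -1 sentinel.
public class StatusCodeSketch {
    static final int UNKNOWN = -1;

    // Overall status across distributed commands. Picking the max real code
    // is an illustrative choice; the essential fix is the orElse(500).
    static int overallStatus(List<Integer> shardCodes) {
        return shardCodes.stream()
                .filter(code -> code != UNKNOWN)   // ignore sentinel values
                .max(Integer::compare)
                .orElse(500);                      // all unknown -> 500, not -1
    }

    public static void main(String[] args) {
        System.out.println(overallStatus(List.of(-1, -1)));   // 500
        System.out.println(overallStatus(List.of(-1, 404)));  // 404
    }
}
```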
[jira] [Resolved] (SOLR-17153) CloudSolrClient should not throw "Collection not found" with an out-dated ClusterState
[ https://issues.apache.org/jira/browse/SOLR-17153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17153. - Fix Version/s: 9.6.0 Resolution: Fixed > CloudSolrClient should not throw "Collection not found" with an out-dated > ClusterState > -- > > Key: SOLR-17153 > URL: https://issues.apache.org/jira/browse/SOLR-17153 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Reporter: David Smiley >Priority: Major > Fix For: 9.6.0 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > Today, CloudSolrClient will locally fail if it's asked to send a request to a > collection that it thinks does not exist due to its local ClusterState view > being out-of-date. We shouldn't fail! And most SolrCloud tests should then > remove their waitForState calls that follow collection creation! Other stale > state matters are out-of-scope. > Proposal: CloudSolrClient shouldn't try and be too smart. Always route a > request to Solr (any node); don't presume its state is up-to-date. Maybe, > after a response is received, it can check if its state has been updated and > if not then explicitly get a new state. Or not if that's too complicated.
[jira] [Resolved] (SOLR-17206) Update requests to SolrCloud can return status value of -1
[ https://issues.apache.org/jira/browse/SOLR-17206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17206. - Resolution: Fixed Thanks for contributing! > Update requests to SolrCloud can return status value of -1 > -- > > Key: SOLR-17206 > URL: https://issues.apache.org/jira/browse/SOLR-17206 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 9.4 >Reporter: Paul McArthur >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > It is possible for SolrCloud to return an HTTP status code of -1 for an update > request that is distributed, if all of the distributed commands fail with an > exception that is not a SolrException (e.g. an IOException). > > SolrCmdDistributor.SolrError sets a default value of -1 for the status code > response for the distributed request. If a SolrException is encountered, the > status code is replaced with the code from the response. If any other > exception type is raised, the status code remains as -1. > DistributedUpdatesAsyncException analyzes the errors from all distributed > commands to determine the overall status code for the response. If all status > codes equal -1, then -1 is returned as the status. > > The code should correspond to a valid HTTP status code, in this case, 500.
[jira] [Resolved] (SOLR-17189) DockMakerTest.testRealisticUnicode fails from whitespace assumption
[ https://issues.apache.org/jira/browse/SOLR-17189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-17189. - Assignee: David Smiley Resolution: Fixed > DockMakerTest.testRealisticUnicode fails from whitespace assumption > --- > > Key: SOLR-17189 > URL: https://issues.apache.org/jira/browse/SOLR-17189 > Project: Solr > Issue Type: Test > Security Level: Public(Default Security Level. Issues are Public) > Components: benchmarks >Reporter: David Smiley >Assignee: David Smiley >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > DockMakerTest.testRealisticUnicode fails 1-2% of the time -- > [link|https://ge.apache.org/scans/tests?search.timeZoneId=America%2FNew_York=org.apache.solr.bench.DockMakerTest=testRealisticUnicode]. > {quote}java.lang.AssertionError: expected:<6> but was:<7> > at __randomizedtesting.SeedInfo.seed([C5136F274AFF3ADD:95FFEBF499446D74]:0) > ••• > at > org.apache.solr.bench.DockMakerTest.testRealisticUnicode(DockMakerTest.java:189){quote} > It seems clear it's because it assumes that the "realistic unicode" chars > won't match the regexp: {{\s}}. A single space char is used to join the > words but maybe this or other whitespace chars are in those unicode codepoint > blocks. > Additionally, it's frustrating that this particular benchmark framework > doesn't honor tests.seed in its generation of random data and thus it's hard > to reproduce the failure. That ought to be fixed as well.
[jira] [Commented] (SOLR-17221) Http2SolrClient merges case sensitive solr params
[ https://issues.apache.org/jira/browse/SOLR-17221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17834127#comment-17834127 ] David Smiley commented on SOLR-17221: - Yeah that's a bug! PR welcome. > Http2SolrClient merges case sensitive solr params > - > > Key: SOLR-17221 > URL: https://issues.apache.org/jira/browse/SOLR-17221 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 9.5.0 >Reporter: Yue Yu >Priority: Major > > In solr9.5.0/solrj9.5.0, the multi-shard requests are sent through > Http2SolrClient, and this function composes the actual Jetty Request object: > {code:java} > private Request fillContentStream( > Request req, > Collection<ContentStream> streams, > ModifiableSolrParams wparams, > boolean isMultipart) > throws IOException { > if (isMultipart) { > // multipart/form-data > try (MultiPartRequestContent content = new MultiPartRequestContent()) { > Iterator<String> iter = wparams.getParameterNamesIterator(); > while (iter.hasNext()) { > String key = iter.next(); > String[] vals = wparams.getParams(key); > if (vals != null) { > for (String val : vals) { > content.addFieldPart(key, new StringRequestContent(val), null); > } > } > } > if (streams != null) { > for (ContentStream contentStream : streams) { > String contentType = contentStream.getContentType(); > if (contentType == null) { > contentType = "multipart/form-data"; // default > } > String name = contentStream.getName(); > if (name == null) { > name = ""; > } > HttpFields.Mutable fields = HttpFields.build(1); > fields.add(HttpHeader.CONTENT_TYPE, contentType); > content.addFilePart( > name, > contentStream.getName(), > new InputStreamRequestContent(contentStream.getStream()), > fields); > } > } > req.body(content); > } > } else { > // application/x-www-form-urlencoded > Fields fields = new Fields(); > Iterator<String> iter = wparams.getParameterNamesIterator(); > while (iter.hasNext()) { > String key = iter.next(); > String[] vals = wparams.getParams(key); > if (vals != null) { > for (String val : vals) { > fields.add(key, val); > } > } > } > req.body(new FormRequestContent(fields, FALLBACK_CHARSET)); > } > return req; > } {code} > The problem is the use of this class *Fields fields = new Fields();* where > caseSensitive=false by default; this leads to case sensitive solr params > being merged together. For example f.case_sensitive_field.facet.limit=5 & > f.CASE_SENSITIVE_FIELD.facet.limit=99 > Not sure if this is intentional for some reason?
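The merge effect described in the report can be demonstrated without depending on Jetty. The sketch below stands in for Jetty's `Fields` with a `TreeMap` keyed case-insensitively, which mimics the reported default; the class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Demonstrates the reported merge: a case-insensitive key container
// collapses two facet.limit params that Solr treats as distinct
// per-field settings.
public class CaseInsensitiveMergeDemo {

    static Map<String, List<String>> collect(boolean caseSensitive, String[][] params) {
        Map<String, List<String>> fields = caseSensitive
                ? new TreeMap<>()
                : new TreeMap<>(String.CASE_INSENSITIVE_ORDER); // mimics the reported default
        for (String[] kv : params) {
            fields.computeIfAbsent(kv[0], k -> new ArrayList<>()).add(kv[1]);
        }
        return fields;
    }

    public static void main(String[] args) {
        String[][] params = {
            {"f.case_sensitive_field.facet.limit", "5"},
            {"f.CASE_SENSITIVE_FIELD.facet.limit", "99"},
        };
        System.out.println(collect(false, params).size()); // 1: merged (the bug)
        System.out.println(collect(true, params).size());  // 2: kept distinct
    }
}
```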
[jira] [Commented] (SOLR-17158) Terminate distributed processing quickly when query limit is reached
[ https://issues.apache.org/jira/browse/SOLR-17158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833722#comment-17833722 ] David Smiley commented on SOLR-17158: - Fail vs Success should be based on {{shards.tolerant}} -- https://solr.apache.org/guide/solr/latest/deployment-guide/solrcloud-distributed-requests.html#shards-tolerant-parameter > Terminate distributed processing quickly when query limit is reached > > > Key: SOLR-17158 > URL: https://issues.apache.org/jira/browse/SOLR-17158 > Project: Solr > Issue Type: Sub-task > Components: Query Limits >Reporter: Andrzej Bialecki >Assignee: Gus Heck >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > Solr should make sure that when query limits are reached and partial results > are not needed (and not wanted) then both the processing in shards and in the > query coordinator should be terminated as quickly as possible, and Solr > should minimize wasted resources spent on eg. returning data from the > remaining shards, merging responses in the coordinator, or returning any data > back to the user.
[jira] [Commented] (SOLR-17205) De-couple SolrJ required Java version from server Java version (main)
[ https://issues.apache.org/jira/browse/SOLR-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833703#comment-17833703 ] David Smiley commented on SOLR-17205: - While I'm supportive of this change, a user can continue to use the SolrJ from the previous release. We might even invest in some changes there for such users. > De-couple SolrJ required Java version from server Java version (main) > - > > Key: SOLR-17205 > URL: https://issues.apache.org/jira/browse/SOLR-17205 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Reporter: Jan Høydahl >Assignee: Jan Høydahl >Priority: Major > Fix For: main (10.0) > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Solr 9.x requires Java 11, both for server and solrj client. > In Solr 10 we will likely bump required java version to Java 17, or maybe > even 21, and since we are a standalone app we can do that - on the > server-side. > However, to give SolrJ client a broadest possible compatibility with customer > application environments, we should consider de-coupling SolrJ's java > requirement from the server-side. That would allow us to be progressive on > the server side Java without forcing users to stay on latest Java in their > apps. > I don't know if it makes much sense to be compatible too far back on EOL java > versions, but perhaps let SolrJ stay one LTS version behind the server for > broad compatibility.
[jira] [Commented] (SOLR-13350) Explore collector managers for multi-threaded search
[ https://issues.apache.org/jira/browse/SOLR-13350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832928#comment-17832928 ] David Smiley commented on SOLR-13350: - Do we have a limited queue for the pool, leading to the "RejectedExecution"? I think we'd like a caller-runs policy so we don't wait & starve > Explore collector managers for multi-threaded search > > > Key: SOLR-13350 > URL: https://issues.apache.org/jira/browse/SOLR-13350 > Project: Solr > Issue Type: New Feature >Reporter: Ishan Chattopadhyaya >Assignee: Ishan Chattopadhyaya >Priority: Major > Attachments: SOLR-13350.patch, SOLR-13350.patch, SOLR-13350.patch > > Time Spent: 7h 10m > Remaining Estimate: 0h > > AFAICT, SolrIndexSearcher can be used only to search all the segments of an > index in series. However, using CollectorManagers, segments can be searched > concurrently and result in reduced latency. Opening this issue to explore the > effectiveness of using CollectorManagers in SolrIndexSearcher from latency > and throughput perspective.
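The caller-runs policy suggested in the comment can be sketched with a stock `ThreadPoolExecutor`; the pool and queue sizes below are arbitrary, chosen only to make overflow likely:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the comment's suggestion: with a bounded queue, use
// CallerRunsPolicy so an overloaded pool pushes work back onto the
// submitting thread instead of throwing RejectedExecutionException.
public class CallerRunsDemo {

    static int runTasks(int taskCount) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),                // deliberately tiny queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow runs on caller
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(completed::incrementAndGet);       // never rejected
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTasks(100)); // 100: every task ran somewhere
    }
}
```

The trade-off is that the submitting thread blocks while it executes the overflow task, which is exactly the back-pressure the comment asks for, as opposed to failing the request outright.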
[jira] [Commented] (SOLR-14763) SolrJ Client Async HTTP/2 Requests
[ https://issues.apache.org/jira/browse/SOLR-14763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832434#comment-17832434 ] David Smiley commented on SOLR-14763: - This missed 9.0 sadly; I suppose we could add an adapter method for the current signature. > SolrJ Client Async HTTP/2 Requests > -- > > Key: SOLR-14763 > URL: https://issues.apache.org/jira/browse/SOLR-14763 > Project: Solr > Issue Type: Improvement > Components: SolrJ >Affects Versions: 8.7 >Reporter: Rishi Sankar >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > > In SOLR-14354, [~caomanhdat] created an API to use Jetty async API to make > more thread efficient HttpShardHandler requests. This added public async > request APIs to Http2SolrClient and LBHttp2SolrClient. There are a few ways > this API can be improved, that I will track in this issue: > 1) Using a CompletableFuture-based async API signature, instead of using > internal custom interfaces (Cancellable, AsyncListener) - based on [this > discussion|https://lists.apache.org/thread.html/r548f318d9176c84ad1a4ed49ff182eeea9f82f26cb23e372244c8a23%40%3Cdev.lucene.apache.org%3E]. > 2) An async API is also useful in other HTTP/2 Solr clients as well, > particularly CloudHttp2SolrClient (SOLR-14675). I will add a requestAsync > method to the SolrClient class, with a default method that initially throws > an unsupported operation exception (maybe this can be later updated to use an > executor to handle the async request as a default impl). For now, I'll > override the default implementation in the Http2SolrClient and > CloudHttp2SolrClient. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For additional commands, e-mail: issues-h...@solr.apache.org
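The CompletableFuture-based signature proposed in point 1 amounts to wrapping a callback-style call in a future. The sketch below shows the adapter pattern involved; all names are illustrative, not the PR's code:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

// Sketch of point 1 above: wrap a callback-style async call in a
// CompletableFuture so callers compose with the JDK API instead of
// bespoke Cancellable/AsyncListener interfaces. Names are illustrative.
public class FutureAdapterSketch {

    /** Stand-in for a callback-taking client method. */
    static void callbackRequest(String request, BiConsumer<String, Throwable> callback) {
        callback.accept("response:" + request, null); // completes inline in this sketch
    }

    /** CompletableFuture facade over the callback API. */
    static CompletableFuture<String> requestAsync(String request) {
        CompletableFuture<String> future = new CompletableFuture<>();
        callbackRequest(request, (result, error) -> {
            if (error != null) {
                future.completeExceptionally(error);
            } else {
                future.complete(result);
            }
        });
        return future; // future.cancel(true) replaces the old Cancellable handle
    }

    public static void main(String[] args) {
        System.out.println(requestAsync("ping").join()); // response:ping
    }
}
```

One design note: the returned future doubles as the cancellation handle, which is one reason the discussion linked above favored CompletableFuture over a separate Cancellable interface.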
[jira] [Commented] (SOLR-17211) HttpJdkSolrClient: Support Async
[ https://issues.apache.org/jira/browse/SOLR-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832433#comment-17832433 ] David Smiley commented on SOLR-17211: - James, please see SOLR-14763 which suggests an API that replaces Solr's needless bespoke interface with the JDK-provided CompletableFuture. There's a PR there too... sadly the contributor and I both lost sight of it but there it sits anyway. > HttpJdkSolrClient: Support Async > - > > Key: SOLR-17211 > URL: https://issues.apache.org/jira/browse/SOLR-17211 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Reporter: James Dyer >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > With SOLR-599 we added a new SolrJ client *HttpJdkSolrClient* which uses > *java.net.http.HttpClient* internally. This JDK client supports asynchronous > requests, so the Jdk Solr Client likewise can have async support. This > ticket is to: > 1. Extract from *Http2SolrClient* method > {code:java} > public Cancellable asyncRequest( > SolrRequest<?> solrRequest, > String collection, > AsyncListener<NamedList<Object>> asyncListener) > {code} > 2. Implement on *HttpJdkSolrClient* > 3. Add javadoc for both clients. > 4. Add unit tests for both clients.
[jira] [Resolved] (SOLR-14892) shards.info with shards.tolerant can yield an empty key
[ https://issues.apache.org/jira/browse/SOLR-14892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved SOLR-14892. - Fix Version/s: 9.6.0 Resolution: Fixed Thanks for contributing Mathieu! > shards.info with shards.tolerant can yield an empty key > --- > > Key: SOLR-14892 > URL: https://issues.apache.org/jira/browse/SOLR-14892 > Project: Solr > Issue Type: Bug > Components: search >Reporter: David Smiley >Priority: Minor > Fix For: 9.6.0 > > Attachments: solr14892.png > > Time Spent: 2h 20m > Remaining Estimate: 0h > > When using shards.tolerant=true and shards.info=true when a shard isn't > available (and maybe other circumstances), the shards.info section of the > response may have an empty-string key child with a value that is ambiguous as > to which shard(s) couldn't be reached. > This problem can be revealed by modifying > org.apache.solr.cloud.TestDownShardTolerantSearch#searchingShouldFailWithoutTolerantSearchSetToTrue > to add shards.info and then examine the response in a debugger.