[
https://issues.apache.org/jira/browse/SOLR-17363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jerry Chung updated SOLR-17363:
-------------------------------
Description:
A Config API request was submitted to update a user property, but the overlay
version on one of the replicas was not updated until the Solr service was
restarted.
This seems to happen in two cases:
* When a replica has been deleted, but the request-handling node has already
created a runner for that replica and keeps waiting for its response.
* All replicas seem to need a reload when a user property is updated, and only
one replica can be reloaded at a time, so it is possible that not all replicas
of a collection can be reloaded within the allotted time (30 seconds).
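The failure mode described in the two bullets above can be modelled with a small, self-contained Java sketch. This is a simplified illustration and not Solr's actual code: the class OverlayWaitSketch, the Replica type, the core names and the scaled-down timeout are all hypothetical. It assumes one runner per replica and a single shared deadline, which is what the error message suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class OverlayWaitSketch {

    /** Hypothetical stand-in for a replica: its core name and the delay before it confirms. */
    public static final class Replica {
        final String core;
        final long confirmAfterMillis;

        public Replica(String core, long confirmAfterMillis) {
            this.core = core;
            this.confirmAfterMillis = confirmAfterMillis;
        }
    }

    /**
     * Starts one runner per replica and waits up to timeoutMillis for all of them.
     * Returns the cores whose runner had not finished when the deadline passed.
     */
    public static List<String> waitForAllReplicas(List<Replica> replicas, long timeoutMillis)
            throws InterruptedException {
        List<String> failed = new ArrayList<>();
        if (replicas.isEmpty()) {
            return failed;
        }
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            List<Callable<Boolean>> runners = new ArrayList<>();
            for (Replica replica : replicas) {
                runners.add(() -> {
                    // Simulates waiting for the replica to report the new overlay version.
                    Thread.sleep(replica.confirmAfterMillis);
                    return Boolean.TRUE;
                });
            }
            // invokeAll cancels every runner that is still unfinished at the deadline.
            List<Future<Boolean>> results =
                    pool.invokeAll(runners, timeoutMillis, TimeUnit.MILLISECONDS);
            for (int i = 0; i < results.size(); i++) {
                if (results.get(i).isCancelled()) {
                    failed.add(replicas.get(i).core);
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        List<Replica> replicas = new ArrayList<>();
        replicas.add(new Replica("healthy_replica", 10));
        // A deleted replica never answers; modelled here as a very long delay.
        replicas.add(new Replica("deleted_replica", 60_000));
        List<String> failed = waitForAllReplicas(replicas, 300);
        System.out.println(failed.size() + " out of " + replicas.size()
                + " replicas did not confirm in time: " + failed);
    }
}
```

In this model a deleted replica and a replica that is merely slow to reload look the same to the waiting node: both exceed the deadline and end up in the failed list, even though every other replica confirmed.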
On the client side:
{{Caused by:
org.apache.solr.client.solrj.impl.BaseHttpSolrClient$RemoteSolrException: Error
from server at
[https://ip-100-65-244-110.ec2.internal:8983/solr/mycollection/config?wt=javabin&version=2:]
1 out of 30 the property overlay to be of version 61 within 30 seconds! Failed
cores:
[https://ip-100-65-216-96.ec2.internal:8983/solr/mycollection_shard2_replica_n139/]}}
{{ at
org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:920)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
org.apache.solr.client.solrj.impl.Http2SolrClient.processErrorsAndResponse(Http2SolrClient.java:576)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:533)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
org.apache.solr.client.solrj.impl.LBSolrClient.doRequest(LBSolrClient.java:386)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
org.apache.solr.client.solrj.impl.LBSolrClient.request(LBSolrClient.java:352)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
fortiva.service.storage.index.CloudSolrClient.sendRequest(CloudSolrClient.java:)
~[xxx.jar:?]}}
{{ at
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:898)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:826)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
{{ at
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:234)
~[solr-solrj-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
On the request-handling node (taken from a different instance):
{{2024-07-08 17:06:39.446 ERROR (qtp1043358826-23) [c:mycollection e3 s:shard3
r:core_node30 x:mycollection_shard3_replica_n29] o.a.s.s.HttpSolrCall 500
Exception => org.apache.solr.common.SolrException: 1 out of 29 the property
overlay to be of version 3 within 30 seconds! Failed cores:
[https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
at
org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895)
org.apache.solr.common.SolrException: 1 out of 29 the property overlay to be of
version 3 within 30 seconds! Failed cores:
[https://ip-100-65-239-139.ec2.internal:8983/solr/mycollection_shard8_replica_n75/]
at
org.apache.solr.handler.SolrConfigHandler.waitForAllReplicasState(SolrConfigHandler.java:895)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
at
org.apache.solr.handler.SolrConfigHandler$Command.handleCommands(SolrConfigHandler.java:594)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
at
org.apache.solr.handler.SolrConfigHandler$Command.handlePOST(SolrConfigHandler.java:407)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
at
org.apache.solr.handler.SolrConfigHandler.handleRequestBody(SolrConfigHandler.java:146)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:226)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2901)
~[solr-core-9.4.0.jar:9.4.0 71e101bb37497f730078d9afe1991b60d10bfe96 -
stillalex - 2023-10-10 19:10:39]}}
On the node where the replica was hosted (logs from the same time as above):
{{2024-07-08 17:06:09.620 INFO (qtp1043358826-125) [
x:mycollection_shard8_replica_n75] c.p.s.s.FSCryptExecutor Policy
9deb2399f8f4d4964c6e867a08f90b2f for
/data/solr/data/mycollection_shard8_replica_n75/data was removed}}
{{2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [
x:mycollection_shard8_replica_n75] o.a.s.c.SolrCore CLOSING SolrCore
org.apache.solr.core.SolrCore@75e20647 mycollection_shard8_replica_n75}}
{{2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [
x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric
reporters for registry=solr.core.mycollection.shard8.replica_n75
tag=SolrCore@75e20647}}
{{2024-07-08 17:06:09.622 INFO (qtp1043358826-125) [
x:mycollection_shard8_replica_n75] o.a.s.m.SolrMetricManager Closing metric
reporters for registry=solr.collection.mycollection.shard8.leader
tag=SolrCore@75e20647}}
{{2024-07-08 17:06:09.627 INFO (qtp1043358826-125) [
x:mycollection_shard8_replica_n75] o.a.s.u.DirectUpdateHandler2 Committing on
IndexWriter.close() ... SKIPPED (unnecessary).}}
was:
A Config API request was submitted to update a user property, but the version
on one of the replicas was not updated until the Solr service was restarted.
This seems to happen when a replica has been deleted, but the request-handling
node has already created a runner for that replica and waits for its response.
> ConfigRequest fails until Solr gets restarted
> ---------------------------------------------
>
> Key: SOLR-17363
> URL: https://issues.apache.org/jira/browse/SOLR-17363
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: config-api
> Affects Versions: 9.4
> Reporter: Jerry Chung
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)