[ https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Gerlowski updated SOLR-17515:
-----------------------------------
Description:
Several reporters on the users@ list recently shared a bug they noticed on
upgrading to Solr 9.7. Replicas would try to recover, but fail with a
NullPointerException:
{code}
2024-09-18 09:36:31.238 ERROR
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:]
o.a.s.c.RecoveryStrategy Error while trying to recover.
core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot
invoke
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
because "this.authenticationStore" is null
at
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
because "this.authenticationStore" is null
at
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum -
2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum -
2024-09-03 15:05:20]
...
{code}
It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0
cluster (new or existing/upgrading) that uses basic-auth will hit this NPE
during replica recovery*. The result is that replicas fail to recover and
sit marked as "recovering" indefinitely.
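One plausible reading of the stack trace above is an initialization-order problem: a setup hook invoked from {{Http2SolrClient$Builder.build}} calls {{setAuthenticationStore}} before the client's {{authenticationStore}} field has been assigned. The following is only a minimal, self-contained sketch of that general failure mode. The names {{InitOrderSketch}}, {{StoreHolder}}, {{Client}}, and {{tryBuild}} are hypothetical stand-ins, not Solr classes, and the sketch makes no claim about the actual root cause or fix.

```java
import java.util.function.Consumer;

// Illustrative only: hypothetical stand-ins, NOT the actual Solr classes.
public class InitOrderSketch {

  /** Stand-in for a holder object that the client delegates to. */
  static final class StoreHolder {
    private String store = "default";

    void update(String newStore) {
      this.store = newStore;
    }

    String current() {
      return store;
    }
  }

  /** Stand-in for a client whose constructor runs a caller-supplied setup hook. */
  static final class Client {
    private StoreHolder holder; // stays null until the constructor assigns it

    Client(Consumer<Client> setupHook, boolean hookFirst) {
      if (hookFirst) {
        // Assumed problematic order: the hook runs while 'holder' is still null.
        setupHook.accept(this);
        this.holder = new StoreHolder();
      } else {
        // Safe order: the field is assigned before the hook can touch it.
        this.holder = new StoreHolder();
        setupHook.accept(this);
      }
    }

    void setStore(String store) {
      holder.update(store); // throws NullPointerException if 'holder' is unassigned
    }

    String store() {
      return holder.current();
    }
  }

  /** Builds a client with a hook that configures auth, reporting the outcome. */
  static String tryBuild(boolean hookFirst) {
    try {
      Client c = new Client(client -> client.setStore("basic-auth"), hookFirst);
      return "ok:" + c.store();
    } catch (NullPointerException e) {
      return "npe";
    }
  }

  public static void main(String[] args) {
    System.out.println(tryBuild(true));  // prints "npe"
    System.out.println(tryBuild(false)); // prints "ok:basic-auth"
  }
}
```

In this sketch the only difference between the failing and working paths is whether the field is assigned before the hook runs, which is why such a bug would surface only when a hook that touches the field (here, the basic-auth setup) is configured.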
The issue can be reproduced locally in a source-checkout using the following
steps:
{code}
git checkout branch_9_7
./gradlew clean assemble
cd solr/packaging/build/solr-9.7.0-SNAPSHOT
# At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, "_default" configset
bin/solr start -e cloud
bin/solr post -c gettingstarted example/exampledocs/books.json
# Stop the node containing the non-leader replica
bin/solr stop -p <port>
bin/solr post -c gettingstarted example/exampledocs/books.csv
# Enable auth and trigger recovery by turning the node back on
bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown true
# This line will need to be tweaked based on which Solr node was previously stopped
bin/solr start --cloud -p <port> -s "example/cloud/<node>/solr" -z 127.0.0.1:9983
{code}
was:
Several reporters on the users@ list, recently shared a bug they noticed on
upgrading to Solr 9.7. Replicas would try to recover, but fail with a
NullPointerException:
{code}
2024-09-18 09:36:31.238 ERROR
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:]
o.a.s.c.RecoveryStrategy Error while trying to recover.
core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot
invoke
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
because "this.authenticationStore" is null
at
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke
"org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
because "this.authenticationStore" is null
at
org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum -
2024-09-03 15:05:20]
at
org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309)
~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum -
2024-09-03 15:05:20]
at
com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212)
~[metrics-core-4.2.26.jar:4.2.26]
at
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
~[?:?]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
~[?:?]
at
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449)
~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
- 2024-09-03 15:05:20]
at
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
~[?:?]
at
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
~[?:?]
at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
2024-09-18 09:36:31.238 ERROR
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:]
o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2024-09-18 09:36:31.238 INFO
(recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr
dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts
s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:]
o.a.s.c.RecoveryStrategy Wait [4] seconds before trying to recover again
(attempt=1)
{code}
It turns out that the issue isn't specific to upgrading clusters: any 9.7.0
cluster (new or existing/upgrading) that uses basic-auth will hit this NPE on
during replica recovery. The result is that replicas will fail to recover, and
sit marked as "recovering" indefinitely.
The issue can be reproduced locally in a source-checkout using the following
steps:
{code}
git checkout branch_9_7
./gradlew clean assemble
cd solr/packaging/build/solr-9.7.0-SNAPSHOT
# At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas,
"_default" configset
bin/solr start -e cloud
bin/solr post -c gettingstarted example/exampledocs/books.json
# Stop the node containing the non-leader replica
bin/solr stop -p <port>
bin/solr post -c gettingstarted example/exampledocs/books.csv
# Enable auth and trigger recovery by turning the node back on
bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown
true
# This line will need tweaked based on which Solr node was previously stopped
"bin/solr" start --cloud -p <port> -s "example/cloud/<node>/solr" -z
127.0.0.1:9983
{code}
> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -----------------------------------------------------
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 9.7
> Reporter: Jason Gerlowski
> Priority: Major