[
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17892962#comment-17892962
]
Sanjay Dutt commented on SOLR-17515:
------------------------------------
Thank you so much [~gerlowskija] for reproducing it and providing all the
details. I am a bit confused, though, by all the different auth mechanisms we
have in place. Even last time, two auth cases were found, and new test cases
were added for them. Clearly, more test cases are required. I am going to work
on this one unless you are already on it.
> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -----------------------------------------------------
>
> Key: SOLR-17515
> URL: https://issues.apache.org/jira/browse/SOLR-17515
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 9.7
> Reporter: Jason Gerlowski
> Priority: Major
>
> Several reporters on the users@ list recently shared a bug they noticed on
> upgrading to Solr 9.7. Replicas would try to recover, but fail with a
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR
> (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr
> dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts
> s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:]
> o.a.s.c.RecoveryStrategy Error while trying to recover.
> core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot
> invoke
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
> because "this.authenticationStore" is null
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke
> "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)"
> because "this.authenticationStore" is null
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97)
> ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85)
> ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093)
> ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062)
> ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907)
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633)
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 -
> anshum - 2024-09-03 15:05:20]
> at
> org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333)
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
> - 2024-09-03 15:05:20]
> at
> org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309)
> ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum
> - 2024-09-03 15:05:20]
> ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE
> during replica recovery*. The result is that replicas will fail to recover
> and sit marked as "recovering" indefinitely.
> The issue can be reproduced locally in a source-checkout using the following
> steps:
> {code}
> git checkout branch_9_7
> ./gradlew clean assemble
> cd solr/packaging/build/solr-9.7.0-SNAPSHOT
> # At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas,
> # "_default" configset
> bin/solr start -e cloud
> bin/solr post -c gettingstarted example/exampledocs/books.json
> # Stop the node containing the non-leader replica
> bin/solr stop -p <port>
> bin/solr post -c gettingstarted example/exampledocs/books.csv
> # Enable auth and trigger recovery by turning the node back on
> bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown true
> # This line will need to be tweaked based on which Solr node was previously stopped
> "bin/solr" start --cloud -p <port> -s "example/cloud/<node>/solr" -z 127.0.0.1:9983
> {code}