[ 
https://issues.apache.org/jira/browse/SOLR-17515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gerlowski updated SOLR-17515:
-----------------------------------
    Description: 
Several reporters on the users@ list recently shared a bug they noticed on 
upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
NullPointerException:

{code}
2024-09-18 09:36:31.238 ERROR (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Error while trying to recover. core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
        at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
        at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        ...
{code}

It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
during replica recovery*.  The result is that replicas will fail to recover, 
and sit marked as "recovering" indefinitely.
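
To confirm that a stuck replica is hitting this particular failure, one option is to search the node's log for the NPE message shown above. A minimal sketch; the function name count_auth_npe and the log path argument are illustrative and not part of Solr:

```shell
# Count log lines carrying the NPE signature from the stack trace above.
# count_auth_npe is a hypothetical helper; pass the path to a Solr log file.
count_auth_npe() {
  # -F treats the pattern as a fixed string; -c prints the number of matching lines.
  # grep exits non-zero when nothing matches, so "|| true" keeps this safe under "set -e".
  grep -F -c 'because "this.authenticationStore" is null' "$1" || true
}
```

For the local example cluster this might be called as count_auth_npe "example/cloud/<node>/logs/solr.log", assuming the default log location; a non-zero count suggests the node is affected.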

The issue can be reproduced locally in a source-checkout using the following 
steps:

{code}
git checkout branch_9_7
./gradlew clean assemble
cd solr/packaging/build/solr-9.7.0-SNAPSHOT

# At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, "_default" configset
bin/solr start -e cloud

bin/solr post -c gettingstarted example/exampledocs/books.json
# Stop the node containing the non-leader replica
bin/solr stop -p <port>
bin/solr post -c gettingstarted example/exampledocs/books.csv

# Enable auth and trigger recovery by turning the node back on
bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown true
# This line will need to be tweaked based on which Solr node was previously stopped
"bin/solr" start --cloud -p <port> -s "example/cloud/<node>/solr" -z 127.0.0.1:9983
{code}
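
After the restarted node comes back up, the stuck state can be checked through the Collections API CLUSTERSTATUS action. A rough sketch, assuming the example cluster is reachable on localhost port 8983 (an assumption; use whichever node is running) and using the credentials from the steps above. The helper name count_recovering is hypothetical, and it does a plain text scan rather than a real JSON parse:

```shell
# Count replicas reported as "recovering" in CLUSTERSTATUS JSON read from stdin.
# count_recovering is a hypothetical helper, not part of Solr.
count_recovering() {
  # -o prints each match on its own line, so wc -l yields the number of matches.
  # tr strips the padding some wc implementations add.
  grep -Eo '"state" *: *"recovering"' | wc -l | tr -d ' '
}

# Assumed usage against the local example cluster:
# curl -s -u solr:solrRocks \
#   "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=gettingstarted" \
#   | count_recovering
```

If the bug is triggered, the count should stay above zero instead of dropping back to zero once recovery would normally finish.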

  was:
Several reporters on the users@ list recently shared a bug they noticed on 
upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
NullPointerException:

{code}
2024-09-18 09:36:31.238 ERROR (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Error while trying to recover. core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
        at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
        at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:212) ~[metrics-core-4.2.26.jar:4.2.26]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$1(ExecutorUtil.java:449) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
2024-09-18 09:36:31.238 ERROR (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Recovery failed - trying again... (0)
2024-09-18 09:36:31.238 INFO  (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Wait [4] seconds before trying to recover again (attempt=1)
{code}

It turns out that the issue isn't specific to upgrading clusters: any 9.7.0 
cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
during replica recovery.  The result is that replicas will fail to recover, and 
sit marked as "recovering" indefinitely.

The issue can be reproduced locally in a source-checkout using the following 
steps:

{code}
git checkout branch_9_7
./gradlew clean assemble
cd solr/packaging/build/solr-9.7.0-SNAPSHOT

# At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, "_default" configset
bin/solr start -e cloud

bin/solr post -c gettingstarted example/exampledocs/books.json
# Stop the node containing the non-leader replica
bin/solr stop -p <port>
bin/solr post -c gettingstarted example/exampledocs/books.csv

# Enable auth and trigger recovery by turning the node back on
bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown true
# This line will need to be tweaked based on which Solr node was previously stopped
"bin/solr" start --cloud -p <port> -s "example/cloud/<node>/solr" -z 127.0.0.1:9983
{code}


> Recovery fails in Solr 9.7.0 if basic-auth is enabled
> -----------------------------------------------------
>
>                 Key: SOLR-17515
>                 URL: https://issues.apache.org/jira/browse/SOLR-17515
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 9.7
>            Reporter: Jason Gerlowski
>            Priority: Major
>
> Several reporters on the users@ list recently shared a bug they noticed on 
> upgrading to Solr 9.7.  Replicas would try to recover, but fail with a 
> NullPointerException:
> {code}
> 2024-09-18 09:36:31.238 ERROR (recoveryExecutor-12-thread-1-processing-fts06.host.internal:8983_solr dovecot_fts_shard5_replica_n61 dovecot_fts shard5 core_node62) [c:dovecot_fts s:shard5 r:core_node62 x:dovecot_fts_shard5_replica_n61 t:] o.a.s.c.RecoveryStrategy Error while trying to recover. core=dovecot_fts_shard5_replica_n61 => java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
>       at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318)
> java.lang.NullPointerException: Cannot invoke "org.apache.solr.client.solrj.impl.AuthenticationStoreHolder.updateAuthenticationStore(org.eclipse.jetty.client.api.AuthenticationStore)" because "this.authenticationStore" is null
>       at org.apache.solr.client.solrj.impl.Http2SolrClient.setAuthenticationStore(Http2SolrClient.java:318) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:97) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.client.solrj.impl.PreemptiveBasicAuthClientBuilderFactory.setup(PreemptiveBasicAuthClientBuilderFactory.java:85) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.httpClientBuilderSetup(Http2SolrClient.java:1093) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.client.solrj.impl.Http2SolrClient$Builder.build(Http2SolrClient.java:1062) ~[solr-solrj-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:907) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.cloud.RecoveryStrategy.doSyncOrReplicateRecovery(RecoveryStrategy.java:633) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:333) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>       at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:309) ~[solr-core-9.7.0.jar:9.7.0 675a41516e3f3bacfc975590773e7abdca444ff4 - anshum - 2024-09-03 15:05:20]
>         ...
> {code}
> It turns out that the issue isn't specific to upgrading clusters: *any 9.7.0 
> cluster (new or existing/upgrading) that uses basic-auth will hit this NPE 
> during replica recovery*.  The result is that replicas will fail to recover, 
> and sit marked as "recovering" indefinitely.
> The issue can be reproduced locally in a source-checkout using the following 
> steps:
> {code}
> git checkout branch_9_7
> ./gradlew clean assemble
> cd solr/packaging/build/solr-9.7.0-SNAPSHOT
> # At prompts, I chose: 4 nodes, "gettingstarted", 1 shard, 2 replicas, "_default" configset
> bin/solr start -e cloud
> bin/solr post -c gettingstarted example/exampledocs/books.json
> # Stop the node containing the non-leader replica
> bin/solr stop -p <port>
> bin/solr post -c gettingstarted example/exampledocs/books.csv
> # Enable auth and trigger recovery by turning the node back on
> bin/solr auth enable -type basicAuth -credentials solr:solrRocks -blockUnknown true
> # This line will need to be tweaked based on which Solr node was previously stopped
> "bin/solr" start --cloud -p <port> -s "example/cloud/<node>/solr" -z 127.0.0.1:9983
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

