Hi all,

We've got a SolrCloud cluster running 6.3.0 with the BasicAuthentication
plugin enabled. All of the hosts are time-synchronized using NTP and are on
the same network switch.

We're periodically seeing follower replicas put into the down state by the
leader after inter-node requests fail due to invalid timestamps. To reduce
the problem we've increased the pkiauth.ttl value to 10000 (ms), and that
seems to have taken care of most of the occurrences.
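
For context, this is roughly how we understand the timestamp check that is
failing. It is a simplified sketch, not Solr's actual code: the function
name, the argument names, and the exact comparison are assumptions made for
illustration.

```python
# Simplified sketch of a TTL check on an inter-node request timestamp.
# NOT Solr's implementation: names and the comparison are assumptions.

def is_timestamp_valid(sent_ms: int, received_ms: int, ttl_ms: int = 10000) -> bool:
    """Return True if the request was received within ttl_ms of being stamped."""
    return (received_ms - sent_ms) <= ttl_ms


# A 20 ms difference between hosts is far inside a 5000 ms window.
print(is_timestamp_valid(sent_ms=1_000_000, received_ms=1_000_020, ttl_ms=5000))

# A request that takes 6 s to be processed is rejected with a 5000 ms
# window but accepted with a 10000 ms window.
print(is_timestamp_valid(sent_ms=1_000_000, received_ms=1_006_000, ttl_ms=5000))
print(is_timestamp_valid(sent_ms=1_000_000, received_ms=1_006_000, ttl_ms=10000))
```

If the check works along these lines, a 20 ms clock difference alone should
not be enough to fail it, which is why we are asking what else could be
involved.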

I realize the question is vague, but is there anything specific in Solr
that we could look into that would cause these requests to have invalid
keys?

We are working on tracking NTP's performance in case there was some sort of
lapse, but everything we've seen so far puts the hosts within about 20
milliseconds of each other at worst.

Possibly related, though we only noticed it yesterday: a recovery request
was sent from a leader to a follower replica, it didn't seem to have an
authorization header, and the wrong user was chosen.

2017-12-19 23:10:44.764 INFO  (qtp759156157-8224123) [   ]
o.a.s.s.RuleBasedAuthorizationPlugin This resource is configured to have a
permission {
  "name":"core-admin-edit",
  "role":"admin"}, The principal [principal: solrwriter] does not have the
right role
2017-12-19 23:10:44.765 INFO  (qtp759156157-8224123) [   ]
o.a.s.s.HttpSolrCall USER_REQUIRED auth header null context :
userPrincipal: [[principal: solrwriter]] type: [ADMIN], collections: [],
Path: [/admin/cores] path : /admin/cores params
:core=Feeds_shard11_replica2&action=REQUESTRECOVERY&wt=javabin&version=2

How does Solr determine which user/authentication to use for inter-node
requests? Could any of the predefined permissions we've assigned to a user
be causing this?

Thanks,
Chris
