[ https://issues.apache.org/jira/browse/SOLR-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495751#comment-15495751 ]
Alan Woodward commented on SOLR-9512:
-------------------------------------
Having played with this a bit, I don't think adding extra retry logic to
CloudSolrClient is the best solution. Instead, I think we should make
directUpdatesToLeadersOnly a hint rather than a directive, and just make sure
that the leader is the first URL in the list passed to the load balancer. We
can then check the response to see whether the leader was in fact the node that
served that particular request and, if not, invalidate the collection cache.
[~cpoerschke] you worked on SOLR-9090, does this make sense to you?
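
A minimal, self-contained sketch of the idea above, in plain Java with no Solr
dependencies. Every name in it (LeaderHintSketch, ShardView, Client,
sendUpdate, the in-memory "cluster state" map and the set of down nodes) is
hypothetical and only stands in for CloudSolrClient, the load balancer and
ZooKeeper. It is not the actual patch; it only shows the control flow: put the
cached leader first, let the request fall through to another replica, and drop
the cached view when the node that served the request was not the cached leader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LeaderHintSketch {

    /** Hypothetical stand-in for the client's view of one single-shard collection. */
    static final class ShardView {
        final String leaderUrl;
        final List<String> replicaUrls; // all replicas, leader included

        ShardView(String leaderUrl, List<String> replicaUrls) {
            this.leaderUrl = leaderUrl;
            this.replicaUrls = replicaUrls;
        }
    }

    /** Hypothetical client; not the real CloudSolrClient API. */
    static final class Client {
        private final Map<String, ShardView> cache = new HashMap<>();
        private final Map<String, ShardView> clusterState; // stand-in for the source of truth
        private final Set<String> downNodes;               // stand-in for the network

        Client(Map<String, ShardView> clusterState, Set<String> downNodes) {
            this.clusterState = clusterState;
            this.downNodes = downNodes;
        }

        boolean isCached(String collection) {
            return cache.containsKey(collection);
        }

        /** Sends an "update" and returns the URL of the node that served it. */
        String sendUpdate(String collection) {
            // Use the cached view if present, otherwise fetch and cache the current one.
            ShardView view = cache.computeIfAbsent(collection, clusterState::get);

            // Leader is a hint: it goes first, the other replicas follow as fallbacks.
            List<String> urls = new ArrayList<>();
            urls.add(view.leaderUrl);
            for (String url : view.replicaUrls) {
                if (!url.equals(view.leaderUrl)) {
                    urls.add(url);
                }
            }

            // Stand-in for the load balancer: the first live URL serves the request.
            for (String url : urls) {
                if (!downNodes.contains(url)) {
                    if (!url.equals(view.leaderUrl)) {
                        // Served by a non-leader: the cached view is suspect, drop it.
                        cache.remove(collection);
                    }
                    return url;
                }
            }
            cache.remove(collection);
            throw new IllegalStateException("no live replica for " + collection);
        }
    }

    public static void main(String[] args) {
        Map<String, ShardView> state = new HashMap<>();
        Set<String> down = new HashSet<>();
        List<String> replicas = Arrays.asList("http://a/solr", "http://b/solr");
        state.put("c1", new ShardView("http://a/solr", replicas));
        Client client = new Client(state, down);

        System.out.println(client.sendUpdate("c1")); // leader is up and serves the request

        // The old leader goes down and leadership moves to the other replica.
        down.add("http://a/solr");
        state.put("c1", new ShardView("http://b/solr", replicas));

        System.out.println(client.sendUpdate("c1")); // stale cache, request falls through
        System.out.println(client.isCached("c1"));   // cache was invalidated
        System.out.println(client.sendUpdate("c1")); // fresh view, new leader first
    }
}
```

In this sketch the second update still succeeds despite the stale cache, because
the old leader is only the first candidate rather than the only one, and the
third update starts from a refreshed view. How the real client would learn which
node served a request is left open here.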
> CloudSolrClient's cluster state cache can break direct updates to leaders
> -------------------------------------------------------------------------
>
> Key: SOLR-9512
> URL: https://issues.apache.org/jira/browse/SOLR-9512
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Alan Woodward
>
> This is the root cause of SOLR-9305 and (at least some of) SOLR-9390. The
> process goes something like this:
> Documents are added to the cluster via a CloudSolrClient (CSC), with
> directUpdatesToLeadersOnly set to true. CSC caches its view of the
> DocCollection. The leader then goes down, and leadership is reassigned to
> another replica. The next time documents are added, CSC checks its cache
> again and gets the old view of the DocCollection. It then tries to send the
> update directly to the old, now down, leader, and we get ConnectionRefused.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)