[ 
https://issues.apache.org/jira/browse/SOLR-9512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495751#comment-15495751
 ] 

Alan Woodward commented on SOLR-9512:
-------------------------------------

Having played with this a bit, I don't think adding extra retry logic to 
CloudSolrClient is the best solution.  Instead, I think we should make 
directUpdatesToLeadersOnly a hint rather than a directive, and just make sure 
that the leader is the first URL in the list passed to the load-balancer.  We 
can then check the response to see whether the leader was in fact the replica 
that served that particular request and, if not, invalidate the collection 
cache.  [~cpoerschke] you worked on SOLR-9090, does this make sense to you?
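
To make the proposal concrete, here is a rough sketch of the idea in plain 
Java.  It is not a patch and uses no SolrJ classes: LeaderHintSketch, 
ShardView, orderedUrls, sendViaLoadBalancer and sendUpdate are all hypothetical 
names, the cache is simplified to one shard per collection, and the 
"load-balancer" simply returns the first live URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch only; none of these types exist in SolrJ. */
class LeaderHintSketch {

    /** Cached view of a shard: the believed leader plus all known replicas. */
    static final class ShardView {
        final String leaderUrl;
        final List<String> replicaUrls;

        ShardView(String leaderUrl, List<String> replicaUrls) {
            this.leaderUrl = leaderUrl;
            this.replicaUrls = replicaUrls;
        }
    }

    /** The leader goes first (the hint); the other replicas act as fallbacks. */
    static List<String> orderedUrls(ShardView view) {
        List<String> urls = new ArrayList<>();
        urls.add(view.leaderUrl);
        for (String replica : view.replicaUrls) {
            if (!replica.equals(view.leaderUrl)) {
                urls.add(replica);
            }
        }
        return urls;
    }

    /** Stand-in for the load-balancer: returns the first URL that is live. */
    static String sendViaLoadBalancer(List<String> urls, Set<String> liveNodes) {
        for (String url : urls) {
            if (liveNodes.contains(url)) {
                return url;
            }
        }
        throw new IllegalStateException("no live replica available");
    }

    /**
     * Sends an update and returns the URL that served it.  If that URL is not
     * the cached leader, the cached entry is dropped so that the next request
     * has to fetch fresh cluster state.
     */
    static String sendUpdate(String collection,
                             Map<String, ShardView> cache,
                             Set<String> liveNodes) {
        ShardView view = cache.get(collection);
        if (view == null) {
            throw new IllegalStateException("collection not cached: " + collection);
        }
        String servedBy = sendViaLoadBalancer(orderedUrls(view), liveNodes);
        if (!servedBy.equals(view.leaderUrl)) {
            cache.remove(collection); // stale leader: invalidate the cached view
        }
        return servedBy;
    }
}
```

With a cached leader that is down, the update in this sketch falls through to 
a live replica instead of failing with ConnectionRefused, and the stale entry 
is evicted as a side effect.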

> CloudSolrClient's cluster state cache can break direct updates to leaders
> -------------------------------------------------------------------------
>
>                 Key: SOLR-9512
>                 URL: https://issues.apache.org/jira/browse/SOLR-9512
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Alan Woodward
>
> This is the root cause of SOLR-9305 and (at least some of) SOLR-9390.  The 
> process goes something like this:
> Documents are added to the cluster via a CloudSolrClient, with 
> directUpdatesToLeadersOnly set to true.  The CloudSolrClient (CSC) caches its 
> view of the DocCollection.  The leader then goes down, and leadership is 
> reassigned.  The next time documents are added, CSC checks its cache and gets 
> the old view of the DocCollection.  It then tries to send the update directly 
> to the old, now down, leader, and we get ConnectionRefused.


