[
https://issues.apache.org/jira/browse/SOLR-16414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17628565#comment-17628565
]
Patson Luk commented on SOLR-16414:
-----------------------------------
Another minor thought: perhaps `ZkCmdExecutor#retryOperation` should always
check `isClosed()`, so that it does not sleep for a minimum of 1.5 sec when
the connection is already closed. It's not major; perhaps the check was placed
there in case `isClosed()` is expensive to call?
https://github.com/apache/solr/blob/main/solr/solrj-zookeeper/src/java/org/apache/solr/common/cloud/ZkCmdExecutor.java#L69
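
To illustrate the suggestion, here is a minimal, self-contained sketch of a retry loop that checks for a closed client on every attempt, including the first, so a closed client fails fast instead of sleeping. This is not the actual `ZkCmdExecutor` code: `RetryOperationSketch`, `ClosedException`, `ConnectionLossException`, and the `retryDelayMs` parameter are hypothetical names used only for illustration, and the linear backoff is an assumption.

```java
import java.util.concurrent.Callable;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a retrying executor; not Solr's real class.
final class RetryOperationSketch {

    /** Hypothetical: signals that the client was closed before an attempt. */
    static final class ClosedException extends RuntimeException {
        ClosedException() {
            super("client already closed");
        }
    }

    /** Hypothetical: a retryable connection-loss failure. */
    static final class ConnectionLossException extends Exception {
    }

    /**
     * Runs the operation up to retryCount times. The closed check happens
     * before every attempt, so a closed client never reaches the sleep.
     */
    static <T> T retryOperation(Callable<T> operation,
                                BooleanSupplier isClosed,
                                int retryCount,
                                long retryDelayMs) throws Exception {
        if (retryCount < 1) {
            throw new IllegalArgumentException("retryCount must be >= 1");
        }
        ConnectionLossException first = null;
        for (int i = 0; i < retryCount; i++) {
            // Unconditional check: not guarded by "i > 0".
            if (isClosed.getAsBoolean()) {
                throw new ClosedException();
            }
            try {
                return operation.call();
            } catch (ConnectionLossException e) {
                if (first == null) {
                    first = e; // remember the first failure to rethrow later
                }
                if (i != retryCount - 1) {
                    // Assumed linear backoff: delay grows with the attempt number.
                    Thread.sleep((i + 1) * retryDelayMs);
                }
            }
        }
        throw first;
    }
}
```

With this shape, a call made on an already closed client throws immediately rather than after the first backoff. The cost is one extra `isClosed()` call per successful operation, which only matters if that call is expensive.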
> Race condition in PRS state updates
> -----------------------------------
>
> Key: SOLR-16414
> URL: https://issues.apache.org/jira/browse/SOLR-16414
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Noble Paul
> Priority: Major
> Fix For: 9.1
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> For PRS collections, the individual states are potentially updated from
> individual nodes and sometimes from the overseer too. It is possible that:
>
> # OP1 is sent to overseer at T1
> # OP2 is executed in the node itself at T2
>
> Because we cannot guarantee that OP1, sent to the overseer, executes before
> OP2, the final state may be the result of OP1, which is incorrect and can
> lead to errors.
> The solution is to never do any PRS writes from the overseer.