[ 
https://issues.apache.org/jira/browse/SOLR-13815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16947933#comment-16947933
 ] 

Yonik Seeley commented on SOLR-13815:
-------------------------------------

OK, I've verified what the race is (or rather what this test case hits).

At the start of the distributed add (DistributedZkUpdateProcessor.setupRequest) 
the shard states are active/construction/construction (for 
parent-shard/sub-shard1/sub-shard2) and this parent shard receiving the request 
correctly thinks it is the leader.  Then later on for the same update in 
doDistribAdd(), the cluster state is re-retrieved and the shard states are now 
inactive/active/active.  getSubShardLeaders() thus returns null because it is 
only looking for shards with states of CONSTRUCTION or RECOVERY.
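
To make the sequence above concrete, here is a minimal, self-contained sketch of the two reads. It does not use Solr's classes: the State enum, the shard names (shard1, shard1_0, shard1_1) and subShardsToForward() are hypothetical stand-ins, and the filter only mirrors the CONSTRUCTION/RECOVERY check described for getSubShardLeaders().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; these are not Solr's classes.
class SplitRaceSketch {
    enum State { ACTIVE, INACTIVE, CONSTRUCTION, RECOVERY }

    // Mirrors the filter described above: only sub-shards that are still
    // in CONSTRUCTION or RECOVERY are returned; null if there are none.
    static List<String> subShardsToForward(Map<String, State> states, List<String> subShards) {
        List<String> out = new ArrayList<>();
        for (String shard : subShards) {
            State st = states.get(shard);
            if (st == State.CONSTRUCTION || st == State.RECOVERY) {
                out.add(shard);
            }
        }
        return out.isEmpty() ? null : out;
    }

    // Builds a parent/sub-shard1/sub-shard2 state map (shard names are assumed).
    static Map<String, State> states(State parent, State sub1, State sub2) {
        Map<String, State> m = new LinkedHashMap<>();
        m.put("shard1", parent);
        m.put("shard1_0", sub1);
        m.put("shard1_1", sub2);
        return m;
    }

    public static void main(String[] args) {
        List<String> subs = List.of("shard1_0", "shard1_1");

        // First read (start of the request): active/construction/construction.
        Map<String, State> atSetup =
            states(State.ACTIVE, State.CONSTRUCTION, State.CONSTRUCTION);
        // Second read (later in the same request): inactive/active/active.
        Map<String, State> atDistrib =
            states(State.INACTIVE, State.ACTIVE, State.ACTIVE);

        System.out.println(subShardsToForward(atSetup, subs));   // [shard1_0, shard1_1]
        System.out.println(subShardsToForward(atDistrib, subs)); // null
    }
}
```

The request decided it was the leader using the first view, but chose where to forward using the second, so in this sketch the update is forwarded to no sub-shard.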

We could retrieve the cluster state once per request, and that would probably 
be a huge help.  Still... I don't think ZooKeeper can update multiple znodes at 
the same time, so we might still have a very small window where we see 
something like inactive/construction/construction.  I'm not sure what the 
behavior of the current code would be in that case.
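
As a rough illustration of the "once per request" idea, the sketch below captures the state a single time and answers every later question from that copy. Everything here is assumed for illustration: the AtomicReference standing in for the live cluster state, the Request class, and the shard names are hypothetical and are not how Solr is structured.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; names and structure are assumptions, not Solr's API.
class PerRequestSnapshotSketch {
    enum State { ACTIVE, INACTIVE, CONSTRUCTION, RECOVERY }

    // Stand-in for the live cluster state, which may be replaced at any time.
    static final AtomicReference<Map<String, State>> live = new AtomicReference<>();

    // One update request: reads the live state exactly once, at creation.
    static final class Request {
        final Map<String, State> snapshot;

        Request() {
            this.snapshot = live.get();
        }

        boolean parentIsActive() {
            return snapshot.get("shard1") == State.ACTIVE;
        }

        boolean subShardsUnderConstruction() {
            return snapshot.get("shard1_0") == State.CONSTRUCTION
                && snapshot.get("shard1_1") == State.CONSTRUCTION;
        }
    }

    public static void main(String[] args) {
        live.set(Map.of("shard1", State.ACTIVE,
                        "shard1_0", State.CONSTRUCTION,
                        "shard1_1", State.CONSTRUCTION));
        Request req = new Request();

        // The split completes while the request is still in flight.
        live.set(Map.of("shard1", State.INACTIVE,
                        "shard1_0", State.ACTIVE,
                        "shard1_1", State.ACTIVE));

        // Both answers still come from the same view of the cluster.
        System.out.println(req.parentIsActive());             // true
        System.out.println(req.subShardsUnderConstruction()); // true
    }
}
```

This only keeps the two decisions consistent with each other. It does not say what should happen if the single snapshot itself is the intermediate inactive/construction/construction view.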


> Live split can lose data
> ------------------------
>
>                 Key: SOLR-13815
>                 URL: https://issues.apache.org/jira/browse/SOLR-13815
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Yonik Seeley
>            Priority: Major
>         Attachments: fail.191004_053129, fail.191004_093307
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> This issue is to investigate potential data loss during a "live" split (i.e. 
> the split happens while updates are flowing).
> This was discovered during the shared storage work, which was based on a 
> non-release branch_8x sometime before 8.3, so the first step is to try 
> to reproduce it on the master branch without any shared storage changes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
