[ 
https://issues.apache.org/jira/browse/SOLR-13718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16918519#comment-16918519
 ] 

Ishan Chattopadhyaya commented on SOLR-13718:
---------------------------------------------

The above fix caused a test failure in TestLocalFSCloudBackupRestore. There is 
something wrong with processResponses() in ShardRequestTracker (OCMH): 
abortOnError is not respected for async requests. In this fix, I tried 
aborting (on error) the async requests as well. However, RestoreCmd had been 
working around the incorrect behaviour by adding its own checks, and so the 
test started failing after my fix.

Fixing this the right way will require handling these async responses across 
all collection API commands uniformly, and will be a longer effort. For now, 
I'm going to revert my fix and handle the SPLITSHARD failure the same way 
RestoreCmd does.
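
To make the symptom described above concrete, here is a toy model in Python. 
It is not Solr's code: ShardResponse, CommandFailed, process_responses() and 
check_failures_explicitly() are hypothetical names, and the real mechanism 
inside processResponses() is not reproduced. The sketch only assumes what the 
comment states, namely that the abort-on-error check has no effect for async 
requests, so a caller has to inspect the outcomes itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ShardResponse:
    """Hypothetical stand-in for the outcome of one sub-request."""
    core: str
    failed: bool


class CommandFailed(Exception):
    """Raised when a collection command should abort."""


def process_responses(responses: List[ShardResponse],
                      abort_on_error: bool,
                      is_async: bool) -> None:
    """Toy model of the symptom: the abort check is skipped for async requests."""
    if is_async:
        # Assumed behaviour: failures go unnoticed here, even when
        # abort_on_error is True.
        return
    if abort_on_error:
        for response in responses:
            if response.failed:
                raise CommandFailed(f"sub-request on {response.core} failed")


def check_failures_explicitly(responses: List[ShardResponse]) -> None:
    """Caller-side workaround: inspect the outcomes instead of relying on the abort."""
    failed = [r.core for r in responses if r.failed]
    if failed:
        raise CommandFailed("sub-requests failed on: " + ", ".join(failed))
```

In this model, an async call with abort_on_error=True returns normally even 
when a sub-request failed, while the explicit check raises. That is the kind 
of extra check a command needs until the async handling is made uniform.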

> SPLITSHARD using async can cause data loss
> ------------------------------------------
>
>                 Key: SOLR-13718
>                 URL: https://issues.apache.org/jira/browse/SOLR-13718
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.7.2, 8.1, 8.2
>            Reporter: Ishan Chattopadhyaya
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>             Fix For: 7.7.3, 8.3
>
>         Attachments: SOLR-13718.patch, solr-13718-reproduce.sh, solr.zip
>
>
> When using SPLITSHARD with async, if there are underlying failures in the 
> SPLIT core command or other sub-commands of SPLITSHARD, then SPLITSHARD 
> succeeds and results in two empty sub-shards.
> The SPLIT core command can fail in various ways; here is one way to 
> reproduce the problem using a Solr 6x index in Solr 7x.
> -Steps to reproduce (in Solr 7x):-
> {code}
> 1. Import the attached configset, and create a collection.
> 2. Move the attached data directory (index created in Solr 6x) into place, 
> replacing the created collection's data directory. Do a collection RELOAD.
> 3. Issue a *:* query; it returns 5 documents.
> 4. Issue a SPLITSHARD (async), then issue *:* again; it returns 0 documents.
> {code}
> Check attached solr-13718-reproduce.sh script to do the same (without needing 
> the zip file).
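
For steps 3 and 4 above, the requests involved can be sketched as follows. 
This is not the attached solr-13718-reproduce.sh script. It only builds the 
URLs, using the Collections API parameters action, collection, shard, async 
and requestid. The base URL, the collection name "test", the shard name 
"shard1" and the request id "1000" are assumptions for illustration. Steps 1 
and 2 (configset import, data directory swap, RELOAD) are not covered.

```python
from typing import Dict, List
from urllib.parse import urlencode

# Assumed address of a local Solr node; adjust for your setup.
SOLR = "http://localhost:8983/solr"


def collections_api_url(params: Dict[str, str]) -> str:
    """Build a Collections API URL from the given query parameters."""
    return f"{SOLR}/admin/collections?{urlencode(params)}"


def select_all_url(collection: str) -> str:
    """Build a *:* query URL; rows=0 because only the document count matters."""
    return f"{SOLR}/{collection}/select?{urlencode({'q': '*:*', 'rows': 0})}"


def repro_urls(collection: str, shard: str, request_id: str) -> List[str]:
    """URLs for: count documents, async SPLITSHARD, poll status, count again."""
    return [
        select_all_url(collection),
        collections_api_url({
            "action": "SPLITSHARD",
            "collection": collection,
            "shard": shard,
            "async": request_id,
        }),
        collections_api_url({"action": "REQUESTSTATUS", "requestid": request_id}),
        select_all_url(collection),
    ]


if __name__ == "__main__":
    # Hypothetical names; fetch each URL in order (for example with curl).
    for url in repro_urls("test", "shard1", "1000"):
        print(url)
```

With the bug present, the REQUESTSTATUS call would be expected to report 
success while the final query returns 0 documents instead of 5.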



