Christine Poerschke created SOLR-5213:
-----------------------------------------

             Summary: collections?action=SPLITSHARD parent vs. sub-shards numDocs
                 Key: SOLR-5213
                 URL: https://issues.apache.org/jira/browse/SOLR-5213
             Project: Solr
          Issue Type: Improvement
          Components: update
    Affects Versions: 4.4
            Reporter: Christine Poerschke


The problem we saw was that splitting a shard took a long time and, once the 
split had finished, the sub-shards together contained fewer documents than the 
original shard.

The root cause was eventually tracked down: the disappearing documents did not 
fall into the hash range of any of the sub-shards.

Could SolrIndexSplitter report per-segment numDocs for the parent and the 
sub-shards, with at least a warning logged for any discrepancies (documents 
falling into none of the sub-shards, or documents falling into several 
sub-shards)?
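
As a rough illustration of the kind of check meant here, the sketch below 
counts, for a set of document hashes, how many fall into exactly one, none, or 
several sub-shard ranges. It is not SolrIndexSplitter code: the class 
SplitCoverageCheck, the HashRange and Coverage records, and the assumption that 
each document is represented by a precomputed int hash are all hypothetical 
stand-ins chosen for this example (Java 16+ for records).

```java
import java.util.List;

/** Hypothetical helper (not part of Solr) that tallies hash-range coverage. */
public final class SplitCoverageCheck {

    /** Inclusive hash range standing in for one sub-shard's range. */
    public record HashRange(int min, int max) {
        boolean includes(int hash) {
            return hash >= min && hash <= max;
        }
    }

    /** Counts of documents by how many sub-shard ranges they matched. */
    public record Coverage(long inExactlyOne, long inNone, long inSeveral) {
        public boolean hasDiscrepancy() {
            return inNone > 0 || inSeveral > 0;
        }
    }

    /** Classifies every document hash against all sub-shard ranges. */
    public static Coverage check(int[] docHashes, List<HashRange> subRanges) {
        long one = 0, none = 0, several = 0;
        for (int hash : docHashes) {
            int matches = 0;
            for (HashRange range : subRanges) {
                if (range.includes(hash)) {
                    matches++;
                }
            }
            if (matches == 0) {
                none++;          // document would disappear after the split
            } else if (matches == 1) {
                one++;           // the expected case
            } else {
                several++;       // document would be duplicated across sub-shards
            }
        }
        return new Coverage(one, none, several);
    }
}
```

Running such a tally per segment and logging a warning whenever 
hasDiscrepancy() is true would make both failure modes visible, including 
cases where lost and duplicated documents happen to cancel out in the totals.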

Additionally, could a case be made for erroring out when discrepancies are 
detected, i.e. not proceeding with the shard split? This could either always be 
an error, or be controlled by an optional verifyNumDocs=false/true parameter 
for the SPLITSHARD action.
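
To make the proposed behaviour concrete, here is a minimal sketch of how such 
a flag might gate the split. It is only an assumption about how the suggested 
verifyNumDocs parameter could behave: the class SplitNumDocsVerifier and its 
verify method are hypothetical and do not exist in Solr, and the document 
counts are passed in as plain numbers rather than read from an index.

```java
import java.util.logging.Logger;

/** Hypothetical check (not part of Solr) comparing parent and sub-shard numDocs. */
public final class SplitNumDocsVerifier {

    private static final Logger LOG =
            Logger.getLogger(SplitNumDocsVerifier.class.getName());

    /**
     * Returns true if the sub-shard counts add up to the parent count.
     * On a mismatch, throws if verifyNumDocs is true, otherwise logs a
     * warning and returns false.
     */
    public static boolean verify(long parentNumDocs,
                                 long[] subShardNumDocs,
                                 boolean verifyNumDocs) {
        long total = 0;
        for (long n : subShardNumDocs) {
            total += n;
        }
        if (total == parentNumDocs) {
            return true;
        }
        String msg = "Shard split numDocs mismatch: parent=" + parentNumDocs
                + ", sub-shards total=" + total;
        if (verifyNumDocs) {
            // Strict mode: refuse to proceed with the split.
            throw new IllegalStateException(msg);
        }
        // Lenient mode: report the discrepancy but let the caller continue.
        LOG.warning(msg);
        return false;
    }
}
```

Note that comparing totals alone cannot detect a lost document that is offset 
by a duplicated one, so this check complements a per-document range check 
rather than replacing it.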




