Andrzej Bialecki  created SOLR-12729:
----------------------------------------

             Summary: SplitShardCmd should lock the parent shard to prevent 
parallel splitting requests
                 Key: SOLR-12729
                 URL: https://issues.apache.org/jira/browse/SOLR-12729
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: AutoScaling
            Reporter: Andrzej Bialecki 
            Assignee: Andrzej Bialecki 
             Fix For: 7.5


This scenario was discovered by the simulation framework, but it also exists in 
the non-simulated code.

When {{IndexSizeTrigger}} requests SPLITSHARD, which is then successfully 
started and “completed” from the point of view of {{ExecutePlanAction}}, it can 
still take a significant amount of time before the new replicas fully recover 
and cause the switch of shard states (parent to INACTIVE, child from RECOVERY 
to ACTIVE).

If this time is longer than the trigger's {{waitFor}}, the trigger will issue 
the same SPLITSHARD request again. {{SplitShardCmd}} doesn't prevent this new 
request from being processed because the parent shard is still ACTIVE. However, 
a section of the code in {{SplitShardCmd}} will detect that sub-slices with 
the target names already exist and are not active, at which point it will 
delete those sub-slices ({{SplitShardCmd:182}}).

The end result is an infinite loop, where {{IndexSizeTrigger}} will keep 
generating SPLITSHARD requests, and {{SplitShardCmd}} will keep deleting the 
recovering sub-slices created by the previous command.
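
The loop described above can be illustrated with a toy model. This is only a 
sketch and not Solr code: the class {{SplitLoopDemo}}, its method and the 
notion of discrete "ticks" are all hypothetical, and the model assumes the 
trigger re-fires every {{waitFor}} ticks for as long as the parent is ACTIVE.

```java
// Toy model of the re-split loop. All names here are hypothetical, not Solr APIs.
final class SplitLoopDemo {

    /**
     * Returns the number of ticks after which the sub-slices become ACTIVE,
     * or -1 if they never do within maxTicks.
     *
     * @param waitFor      ticks the trigger waits before firing again
     * @param recoveryTime ticks the sub-slices need to recover undisturbed
     */
    static int ticksUntilActive(int waitFor, int recoveryTime, int maxTicks) {
        int recoveryProgress = 0;
        int sinceLastSplit = waitFor; // forces the trigger to fire at tick 0

        for (int tick = 0; tick < maxTicks; tick++) {
            if (sinceLastSplit >= waitFor) {
                // The parent is still ACTIVE, so the split request is accepted.
                // Existing non-active sub-slices are deleted and recreated,
                // which throws away all recovery progress made so far.
                recoveryProgress = 0;
                sinceLastSplit = 0;
            }
            recoveryProgress++;
            sinceLastSplit++;
            if (recoveryProgress >= recoveryTime) {
                return tick + 1; // sub-slices ACTIVE, parent goes INACTIVE
            }
        }
        return -1; // recovery was always interrupted by a repeated split
    }
}
```

In this model, {{ticksUntilActive(10, 5, 100)}} finishes because recovery is 
shorter than {{waitFor}}, while {{ticksUntilActive(5, 10, 1000)}} returns -1 
because every repeated request resets the recovery before it can complete.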

A simple solution is to mark the parent shard as being in the process of 
splitting, so that no other split is attempted on the same shard. 
Furthermore, {{IndexSizeTrigger}} could temporarily exclude such shards from 
monitoring.
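
A minimal sketch of the marking idea follows. It is not the actual fix: 
{{SplitLockRegistry}} and its methods are hypothetical names, and the marker 
is kept in memory only. A real implementation would presumably have to store 
the marker somewhere visible to the whole cluster and clear it when the split 
fails as well as when it succeeds.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry of shards that are currently being split.
final class SplitLockRegistry {
    // Thread-safe set of "collection/shard" keys.
    private final Set<String> splitting = ConcurrentHashMap.newKeySet();

    private static String key(String collection, String shard) {
        return collection + "/" + shard;
    }

    /** Marks the shard as splitting; returns false if a split is already in progress. */
    boolean tryLock(String collection, String shard) {
        return splitting.add(key(collection, shard));
    }

    /** Clears the marker; call this on success and on failure of the split. */
    void unlock(String collection, String shard) {
        splitting.remove(key(collection, shard));
    }

    /** Lets a trigger skip shards that are in the middle of a split. */
    boolean isSplitting(String collection, String shard) {
        return splitting.contains(key(collection, shard));
    }
}
```

With something like this, the split command would call {{tryLock}} first and 
reject the request when it returns false, and the trigger could call 
{{isSplitting}} to leave such shards out of its monitoring.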



