[ 
https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113004#comment-17113004
 ] 

Andrzej Bialecki commented on SOLR-13072:
-----------------------------------------

[~cjcowie] thanks for reporting this - I created a separate issue to track 
this: SOLR-14505.

> Management of markers for nodeLost / nodeAdded events is broken
> ---------------------------------------------------------------
>
>                 Key: SOLR-13072
>                 URL: https://issues.apache.org/jira/browse/SOLR-13072
>             Project: Solr
>          Issue Type: Bug
>          Components: AutoScaling
>    Affects Versions: 7.5, 7.6, 8.0
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 7.7, 8.0, master (9.0)
>
>
> In order to prevent {{nodeLost}} events from being lost when it's the 
> Overseer leader that is the node that was lost a mechanism was added to 
> record markers for these events by any other live node, in 
> {{ZkController.registerLiveNodesListener()}}. As similar mechanism also 
> exists for {{nodeAdded}} events.
> On Overseer leader restart if the autoscaling configuration didn't contain 
> any triggers that consume {{nodeLost}} events then these markers are removed. 
> If there are 1 or more trigger configs that consume {{nodeLost}} events then 
> these triggers would read the markers, remove them and generate appropriate 
> events.
> However, as the {{NodeMarkersRegistrationTest}} shows this mechanism is 
> broken and susceptible to race conditions.
> It's not unusual to have more than 1 {{nodeLost}} trigger because in addition 
> to any user-defined triggers there's always one that is automatically defined 
> if missing: {{.auto_add_replicas}}. However, if there's more than 1 
> {{nodeLost}} trigger then the process of consuming and removing the markers 
> becomes non-deterministic - each trigger may pick up (and delete) all, none, 
> or some of the markers.
> So as it is now this mechanism is broken if more than 1 {{nodeLost}} or more 
> than 1 {{nodeAdded}} trigger is defined.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to