[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113004#comment-17113004 ]
Andrzej Bialecki commented on SOLR-13072:
-----------------------------------------

[~cjcowie] thanks for reporting this - I created a separate issue to track this: SOLR-14505.

> Management of markers for nodeLost / nodeAdded events is broken
> ---------------------------------------------------------------
>
>                 Key: SOLR-13072
>                 URL: https://issues.apache.org/jira/browse/SOLR-13072
>             Project: Solr
>          Issue Type: Bug
>          Components: AutoScaling
>    Affects Versions: 7.5, 7.6, 8.0
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 7.7, 8.0, master (9.0)
>
>
> In order to prevent {{nodeLost}} events from being lost when the node that
> was lost is the Overseer leader, a mechanism was added to record markers for
> these events by any other live node, in
> {{ZkController.registerLiveNodesListener()}}. A similar mechanism also
> exists for {{nodeAdded}} events.
> On Overseer leader restart, if the autoscaling configuration didn't contain
> any triggers that consume {{nodeLost}} events then these markers are removed.
> If there are 1 or more trigger configs that consume {{nodeLost}} events then
> these triggers would read the markers, remove them and generate appropriate
> events.
> However, as the {{NodeMarkersRegistrationTest}} shows, this mechanism is
> broken and susceptible to race conditions.
> It's not unusual to have more than 1 {{nodeLost}} trigger because, in addition
> to any user-defined triggers, there's always one that is automatically defined
> if missing: {{.auto_add_replicas}}. However, if there's more than 1
> {{nodeLost}} trigger then the process of consuming and removing the markers
> becomes non-deterministic - each trigger may pick up (and delete) all, none,
> or some of the markers.
> So as it is now, this mechanism is broken if more than 1 {{nodeLost}} or more
> than 1 {{nodeAdded}} trigger is defined.
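The non-determinism described in the issue can be illustrated with a toy model. The sketch below is not Solr code: it assumes a shared marker store modeled as a plain list, two triggers that each read and delete one remaining marker per step, and a hypothetical trigger name `user_node_lost` alongside `.auto_add_replicas`. The functions `replay` and `outcomes` are made up for this illustration. It enumerates every interleaving of the two triggers and collects which markers each one ends up with.

```python
from itertools import product

# ".auto_add_replicas" comes from the issue text; "user_node_lost" is a
# hypothetical user-defined trigger name used only for this illustration.
TRIGGERS = ("user_node_lost", ".auto_add_replicas")


def replay(markers, schedule):
    """Replay one interleaving of trigger steps.

    Each entry in `schedule` names the trigger that runs next. That trigger
    reads and deletes one remaining marker, if any are left.
    """
    store = list(markers)  # stand-in for the shared marker store
    events = {t: [] for t in TRIGGERS}
    for trigger in schedule:
        if store:
            events[trigger].append(store.pop(0))  # read, then delete
    return events


def outcomes(markers):
    """Collect the distinct marker assignments over all interleavings."""
    results = set()
    for schedule in product(TRIGGERS, repeat=len(markers)):
        events = replay(markers, schedule)
        results.add(tuple(tuple(events[t]) for t in TRIGGERS))
    return results


if __name__ == "__main__":
    for user_seen, auto_seen in sorted(outcomes(["node1", "node2"])):
        print(f"user_node_lost={user_seen}  .auto_add_replicas={auto_seen}")
```

With two markers, this model produces four distinct outcomes: either trigger may see both markers, neither, or one each. That matches the "all, none, or some" behavior in the description, under the simplifying assumption that each delete is atomic and only the ordering varies.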