[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754834#comment-16754834 ]

ASF subversion and git services commented on SOLR-13072:

Commit 1cfbd3e1c84d35e741cfc068a8e88f0eff4ea9e1 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1cfbd3e ]
SOLR-13072: Make sure the new overseer leader is present.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, 8.0
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.0, 7.7, master (9.0)
>
> In order to prevent {{nodeLost}} events from being lost when the node that
> was lost is the Overseer leader, a mechanism was added to record markers
> for these events by any other live node, in
> {{ZkController.registerLiveNodesListener()}}. A similar mechanism also
> exists for {{nodeAdded}} events.
>
> On Overseer leader restart, if the autoscaling configuration didn't contain
> any triggers that consume {{nodeLost}} events then these markers are removed.
> If there are 1 or more trigger configs that consume {{nodeLost}} events then
> these triggers would read the markers, remove them and generate appropriate
> events.
>
> However, as the {{NodeMarkersRegistrationTest}} shows, this mechanism is
> broken and susceptible to race conditions.
>
> It's not unusual to have more than 1 {{nodeLost}} trigger because, in addition
> to any user-defined triggers, there's always one that is automatically defined
> if missing: {{.auto_add_replicas}}. However, if there's more than 1
> {{nodeLost}} trigger then the process of consuming and removing the markers
> becomes non-deterministic - each trigger may pick up (and delete) all, none,
> or some of the markers.
>
> So as it is now, this mechanism is broken if more than 1 {{nodeLost}} or more
> than 1 {{nodeAdded}} trigger is defined.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
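
The non-determinism described in the quoted issue can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use any Solr or ZooKeeper API, the shared marker store is modelled as a plain list, and the function name `simulate` and the node names are hypothetical. It assumes each trigger first lists the markers and then deletes them, and it shows how the second trigger's view depends on how far the first trigger has got.

```python
def simulate(markers, a_deletes_before_b_lists):
    """Trigger A lists the markers and deletes the first k of them;
    only then does trigger B take its own listing of the shared store."""
    store = list(markers)                  # stand-in for the shared marker nodes
    seen_by_a = list(store)                # A's listing: sees every marker
    for marker in seen_by_a[:a_deletes_before_b_lists]:
        store.remove(marker)               # A consumes (deletes) one marker
    seen_by_b = list(store)                # B's listing: whatever is still there
    return seen_by_a, seen_by_b


# Hypothetical node names, for illustration only.
markers = ["node1:8983_solr", "node2:8983_solr"]
for k in range(len(markers) + 1):
    _, seen_by_b = simulate(markers, k)
    print(f"A deleted {k} marker(s) before B listed -> B sees {seen_by_b}")
```

Depending on the interleaving, trigger B observes all, some, or none of the markers, which is the behaviour the issue reports for multiple {{nodeLost}} or {{nodeAdded}} triggers.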
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754185#comment-16754185 ]

ASF subversion and git services commented on SOLR-13072:

Commit 5e1b08878f070baae459b110380ff95e77d0d7bc in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5e1b088 ]
SOLR-13072: Make sure the new overseer leader is present.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16754166#comment-16754166 ]

ASF subversion and git services commented on SOLR-13072:

Commit 692e6381934739626db03c30fe398594f7d5ef33 in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=692e638 ]
SOLR-13072: Make sure the new overseer leader is present.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750951#comment-16750951 ]

ASF subversion and git services commented on SOLR-13072:

Commit 84819c837955c163f5a1a6203ed8005f22053fa8 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=84819c8 ]
SOLR-13072: Fix precommit.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750948#comment-16750948 ]

ASF subversion and git services commented on SOLR-13072:

Commit 6f3d8a97706e1d5dca6b32de6e03953482315cdc in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6f3d8a9 ]
SOLR-13072: Fix precommit.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750943#comment-16750943 ]

ASF subversion and git services commented on SOLR-13072:

Commit aadce4f409f9b5105510a9e82c80a7e55bae46e2 in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=aadce4f ]
SOLR-13072: Fix precommit.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750138#comment-16750138 ]

ASF subversion and git services commented on SOLR-13072:

Commit 7aa260dfab4f6a4ee9d2240ace13b2d50f93c724 in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7aa260d ]
SOLR-13072: Enable this test again.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750079#comment-16750079 ]

ASF subversion and git services commented on SOLR-13072:

Commit 6489c5ad436f0cffc5f007a4564b4fa57853f916 in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6489c5a ]
SOLR-13072: Enable this test again.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16750064#comment-16750064 ]

ASF subversion and git services commented on SOLR-13072:

Commit 72a99e9c5c4f4f7ebcfa63c5cd386d48a9a8da3c in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=72a99e9 ]
SOLR-13072: Enable this test again.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741958#comment-16741958 ]

ASF subversion and git services commented on SOLR-13072:

Commit fea79a8f6b55707268955b1a59154e18c37da253 in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fea79a8 ]
SOLR-13072: Wait for autoscaling config refresh to finish before modifying the cluster and enable the tests for now.
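
The commit message above describes waiting for the autoscaling config refresh to finish before the test modifies the cluster. A generic way to express such a wait is a bounded polling loop. The sketch below is an illustration only, not the code from the commit: `wait_for` and `FakeCluster` are hypothetical names, and the "refresh" is modelled as a version counter catching up with the last written version.

```python
import time


def wait_for(condition, timeout_s=5.0, poll_s=0.05):
    """Poll `condition` until it returns True or the timeout expires.
    Returns True on success and False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


class FakeCluster:
    """Hypothetical stand-in: a config write bumps `written_version`;
    a refresh makes `applied_version` catch up."""

    def __init__(self):
        self.written_version = 0
        self.applied_version = 0

    def write_config(self):
        self.written_version += 1
        return self.written_version

    def refresh(self):
        self.applied_version = self.written_version


cluster = FakeCluster()
version = cluster.write_config()
cluster.refresh()  # in a real system this would happen asynchronously
# Only touch the cluster once the refreshed config is known to be in effect.
ready = wait_for(lambda: cluster.applied_version >= version, timeout_s=1.0)
print("safe to modify cluster:", ready)
```

The point of the design is that the test blocks on an observable condition with a deadline instead of sleeping for a fixed interval, so it neither races ahead of the refresh nor hangs forever if the refresh never happens.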
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741959#comment-16741959 ]

ASF subversion and git services commented on SOLR-13072:

Commit 794f7f829cdf655f750c992803df1968a58f101e in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=794f7f8 ]
SOLR-13072: Use the same wait in other simulated tests where the same race condition may occur.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741931#comment-16741931 ]

ASF subversion and git services commented on SOLR-13072:

Commit 229a0894fbcb152db4ca08119da085a002953943 in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=229a089 ]
SOLR-13072: Wait for autoscaling config refresh to finish before modifying the cluster and enable the tests for now.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741932#comment-16741932 ]

ASF subversion and git services commented on SOLR-13072:

Commit b33df8dc0ff387e999348a03a748d466c2e6de50 in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b33df8d ]
SOLR-13072: Use the same wait in other simulated tests where the same race condition may occur.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738419#comment-16738419 ]

Andrzej Bialecki commented on SOLR-13072:

Sorry for the mess-up ^ this is obviously for SOLR-12730 and not this one.
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738410#comment-16738410 ]

ASF subversion and git services commented on SOLR-13072:

Commit 2bc9904696f3484e9fb901efdd0e9a27b450d2fd in lucene-solr's branch refs/heads/branch_8x from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2bc9904 ]

SOLR-13072: Document the "splitFuzz" parameter.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, 8.0
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.0, 7.7
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16738411#comment-16738411 ]

ASF subversion and git services commented on SOLR-13072:

Commit 9423bdb0cf8dc83b626590c94486cbea44e34183 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9423bdb ]

SOLR-13072: Document the "splitFuzz" parameter.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, 8.0
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.0, 7.7
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737358#comment-16737358 ]

ASF subversion and git services commented on SOLR-13072:

Commit 7db4121b4553568108e1cf91e82c68fc55b6e9f4 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7db4121 ]

SOLR-13072: Use the same wait in other simulated tests where the same race condition may occur.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, 8.0
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.0, 7.7
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737227#comment-16737227 ]

ASF subversion and git services commented on SOLR-13072:

Commit a37e2c609cb26dfffa5b88f8a6b3afa2711880a5 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a37e2c6 ]

SOLR-13072: Wait for autoscaling config refresh to finish before modifying the cluster and enable the tests for now.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, 8.0
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.0, 7.7
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725913#comment-16725913 ]

ASF subversion and git services commented on SOLR-13072:

Commit 846dfbef39880af7f07ab762a1a0a123903966a1 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=846dfbe ]

SOLR-13072: Fix an api change.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: master (8.0), 7.7
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725830#comment-16725830 ]

ASF subversion and git services commented on SOLR-13072:

Commit 7a5aa3dfe6f4f257e475ffb6ef1d760822b2f0aa in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7a5aa3d ]

SOLR-13072: Fix a cherry-pick issue.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725829#comment-16725829 ]

ASF subversion and git services commented on SOLR-13072:

Commit f4bd371e3ee905bd0955491b6a9b0c0797e9c77c in lucene-solr's branch refs/heads/branch_7x from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f4bd371 ]

SOLR-13072: Management of markers for nodeLost / nodeAdded events is broken.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725805#comment-16725805 ]

ASF subversion and git services commented on SOLR-13072:

Commit 1f0e875db65a0a2e9a8a62757aff1770ecf99866 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1f0e875 ]

SOLR-13072: Management of markers for nodeLost / nodeAdded events is broken.

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720662#comment-16720662 ]

ASF subversion and git services commented on SOLR-13072:

Commit a4ca08fba69eb596c602aebddbcb8e1cfa623630 in lucene-solr's branch refs/heads/branch_7x from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a4ca08f ]

SOLR-13072: disable flawed test of flawed functionality

(cherry picked from commit f844461357d43838da51697295a1dcbb69699d9c)

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major
[jira] [Commented] (SOLR-13072) Management of markers for nodeLost / nodeAdded events is broken
[ https://issues.apache.org/jira/browse/SOLR-13072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16720663#comment-16720663 ]

ASF subversion and git services commented on SOLR-13072:

Commit f844461357d43838da51697295a1dcbb69699d9c in lucene-solr's branch refs/heads/master from Chris Hostetter
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f844461 ]

SOLR-13072: disable flawed test of flawed functionality

> Management of markers for nodeLost / nodeAdded events is broken
> ---
>
> Key: SOLR-13072
> URL: https://issues.apache.org/jira/browse/SOLR-13072
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: AutoScaling
> Affects Versions: 7.5, 7.6, master (8.0)
> Reporter: Andrzej Bialecki
> Assignee: Andrzej Bialecki
> Priority: Major