[jira] [Updated] (IGNITE-21313) Incorrect behaviour when invalid zone filter is applied to zone
[ https://issues.apache.org/jira/browse/IGNITE-21313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-21313: --- Description: Let's consider this code to be run in a test: {code:java} sql("CREATE ZONE ZONE1 WITH DATA_NODES_FILTER = 'INCORRECT_FILTER'"); sql("CREATE TABLE TEST(ID INT PRIMARY KEY, VAL0 INT) WITH PRIMARY_ZONE='ZONE1'"); {code} The current behaviour is that the test hangs while spamming {noformat} [2024-01-19T12:56:25,163][ERROR][%ictdt_n_0%metastorage-watch-executor-2][WatchProcessor] Error occurred when notifying safe time advanced callback java.util.concurrent.CompletionException: com.jayway.jsonpath.PathNotFoundException: No results for path: $['INCORRECT_FILTER'] at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?] at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?] at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870) ~[?:?] at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:883) [?:?] at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2257) [?:?] at org.apache.ignite.internal.metastorage.server.WatchProcessor.notifyWatches(WatchProcessor.java:213) ~[main/:?] at org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$notifyWatches$3(WatchProcessor.java:169) ~[main/:?] at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) [?:?] at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:829) [?:?] 
Caused by: com.jayway.jsonpath.PathNotFoundException: No results for path: $['INCORRECT_FILTER']{noformat} We need to fix that and define the reaction to an incorrect filter. *Implementation notes:* To fix it we need to change the implementation of DistributionZonesUtil#filter. Instead of {code:java} List<Map<String, String>> res = JsonPath.read(convertedAttributes, filter);{code} we need to use {code:java} Configuration configuration = new Configuration.ConfigurationBuilder() .options(Option.SUPPRESS_EXCEPTIONS, Option.ALWAYS_RETURN_LIST) .build(); List<Map<String, String>> res = JsonPath.using(configuration).parse(convertedAttributes).read(filter);{code} In this case an incorrect filter will not throw PathNotFoundException and will return an empty 'res'. was: Let's consider this code to be run in a test: {code:java} sql("CREATE ZONE ZONE1 WITH DATA_NODES_FILTER = 'INCORRECT_FILTER'"); sql("CREATE TABLE TEST(ID INT PRIMARY KEY, VAL0 INT) WITH PRIMARY_ZONE='ZONE1'"); {code} The current behaviour is that the test hangs while spamming {noformat} [2024-01-19T12:56:25,163][ERROR][%ictdt_n_0%metastorage-watch-executor-2][WatchProcessor] Error occurred when notifying safe time advanced callback java.util.concurrent.CompletionException: com.jayway.jsonpath.PathNotFoundException: No results for path: $['INCORRECT_FILTER'] at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?] at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?] at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:870) ~[?:?] at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:883) [?:?] at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2257) [?:?] at org.apache.ignite.internal.metastorage.server.WatchProcessor.notifyWatches(WatchProcessor.java:213) ~[main/:?] at org.apache.ignite.internal.metastorage.server.WatchProcessor.lambda$notifyWatches$3(WatchProcessor.java:169) ~[main/:?] 
at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072) [?:?] at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:478) [?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?] at java.lang.Thread.run(Thread.java:829) [?:?] Caused by: com.jayway.jsonpath.PathNotFoundException: No results for path: $['INCORRECT_FILTER']{noformat} We need to fix that and define the reaction to an incorrect filter. *Implementation notes:* To fix it we need to change the DistributionZonesUtil#filter implementation. Instead of {code:java} List<Map<String, String>> res = JsonPath.read(convertedAttributes, filter);{code} we need to use {code:java} Configuration configuration = new Configuration.ConfigurationBuilder() .options(Option.SUPPRESS_EXCEPTIONS,
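The intended contract of the fix can be illustrated without the json-path dependency. The sketch below is hypothetical (class and method names are illustrative, not the actual DistributionZonesUtil code): a filter that matches nothing, including an invalid one, yields an empty result list that the caller treats as "no matching data nodes", instead of an exception propagating into the metastorage watch thread.

```java
import java.util.List;
import java.util.Map;

// Sketch only: mimics the contract of the fixed DistributionZonesUtil#filter,
// with exceptions suppressed. An unknown attribute path returns an empty list
// rather than throwing, so watch processing is never poisoned.
public class FilterSketch {
    /** Returns attribute maps matching the filter key, or an empty list. */
    public static List<Map<String, String>> filter(Map<String, String> attributes, String filterKey) {
        if (!attributes.containsKey(filterKey)) {
            return List.of(); // the suppressed "PathNotFoundException" case: empty result
        }
        return List.of(Map.of(filterKey, attributes.get(filterKey)));
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Map.of("region", "US", "storage", "SSD");
        System.out.println(filter(attrs, "region"));            // one match
        System.out.println(filter(attrs, "INCORRECT_FILTER"));  // empty list, no exception
    }
}
```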
[jira] [Updated] (IGNITE-20412) Fix DistributionZoneCausalityDataNodesTest.java#checkDataNodesRepeated
[ https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20412: --- Summary: Fix DistributionZoneCausalityDataNodesTest.java#checkDataNodesRepeated (was: Fix ItIgniteDistributionZoneManagerNodeRestartTest# testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart) > Fix DistributionZoneCausalityDataNodesTest.java#checkDataNodesRepeated > -- > > Key: IGNITE-20412 > URL: https://issues.apache.org/jira/browse/IGNITE-20412 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 10m > Remaining Estimate: 0h > > h3. Motivation > org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > started to fail in the catalog-feature branch and fails in the main branch > after catalog-feature was merged > [https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=] > {code:java} > java.lang.AssertionError: > Expected: is <[]> > but: was <[A]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459) > at > org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539) > {code} > h3. Implementation notes > The root cause: > # This test changes the metaStorageManager behavior so that it throws an expected > exception on ms.invoke. > # The test alters the zone with a new filter. 
> # DistributionZoneManager#onUpdateFilter returns a future from > saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken). > # That future is completed exceptionally, so > WatchProcessor#notificationFuture will be completed exceptionally as well. > # Subsequent updates will not be handled properly because notificationFuture is > completed exceptionally. > We have already created tickets about exception handling: > * https://issues.apache.org/jira/browse/IGNITE-14693 > * https://issues.apache.org/jira/browse/IGNITE-14611 > > The test scenario is incorrect because the node should be stopped (by the failure > handler) if ms.invoke fails. We need to rewrite it when the DZM restart > is updated. > UPD1: > I tried to rewrite the test so that we do not throw an exception in the metastorage > handler but instead force the thread to wait in this invoke. However, this led to a > problem: we use a spy on the standalone metastorage, and Mockito uses a synchronized > block when we call ms.invoke, so blocking one invoke blocks all other communication with the metastorage. > Further investigation is needed into how to rewrite this test. > > UPD2: > The testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > test was removed in another commit. However, another test was > disabled by this ticket, and it is now fixed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest# testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20412: --- Description: h3. Motivation org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart started to fail in the catalog-feature branch and fails in the main branch after catalog-feature was merged [https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=] {code:java} java.lang.AssertionError: Expected: is <[]> but: was <[A]> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459) at org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539) {code} h3. Implementation notes The root cause: # This test changes the metaStorageManager behavior so that it throws an expected exception on ms.invoke. # The test alters the zone with a new filter. # DistributionZoneManager#onUpdateFilter returns a future from saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken). # That future is completed exceptionally, so WatchProcessor#notificationFuture will be completed exceptionally as well. # Subsequent updates will not be handled properly because notificationFuture is completed exceptionally. We have already created tickets about exception handling: * https://issues.apache.org/jira/browse/IGNITE-14693 * https://issues.apache.org/jira/browse/IGNITE-14611 The test scenario is incorrect because the node should be stopped (by the failure handler) if ms.invoke fails. We need to rewrite it when the DZM restart is updated. 
UPD1: I tried to rewrite the test so that we do not throw an exception in the metastorage handler but instead force the thread to wait in this invoke. However, this led to a problem: we use a spy on the standalone metastorage, and Mockito uses a synchronized block when we call ms.invoke, so blocking one invoke blocks all other communication with the metastorage. Further investigation is needed into how to rewrite this test. UPD2: The testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart test was removed in another commit. However, another test was disabled by this ticket, and it is now fixed. was: h3. Motivation org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart started to fail in the catalog-feature branch and fails in the main branch after catalog-feature was merged [https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=] {code:java} java.lang.AssertionError: Expected: is <[]> but: was <[A]> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459) at org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539) {code} h3. Implementation notes The root cause: # This test changes the metaStorageManager behavior so that it throws an expected exception on ms.invoke. # The test alters the zone with a new filter. 
# DistributionZoneManager#onUpdateFilter returns a future from saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken). # That future is completed exceptionally, so WatchProcessor#notificationFuture will be completed exceptionally as well. # Subsequent updates will not be handled properly because notificationFuture is completed exceptionally. We have already created tickets about exception handling: * https://issues.apache.org/jira/browse/IGNITE-14693 * https://issues.apache.org/jira/browse/IGNITE-14611 The test scenario is incorrect because the node should be stopped (by the failure handler) if ms.invoke fails. We need to rewrite it when the DZM restart is updated. UPD1: I tried to rewrite the test so that we do not throw an exception in the metastorage handler but instead force the thread to wait in this invoke. However, this led to a problem: we use a spy on the standalone metastorage, and Mockito uses a synchronized block when we call ms.invoke, so blocking one invoke blocks all other communication with the metastorage. Further investigation is needed into how to rewrite this test.
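The root-cause chain above (one exceptionally completed stage poisoning all later notifications) can be reproduced with a plain CompletableFuture chain. This is an illustrative stdlib sketch, not the actual WatchProcessor code; names like notificationFuture mirror the description only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative only: shows how one exceptionally completed stage in a
// sequential notification chain (like WatchProcessor#notificationFuture)
// prevents later updates from being handled.
public class NotificationChainDemo {
    private CompletableFuture<Void> notificationFuture = CompletableFuture.completedFuture(null);
    public final List<String> handled = new ArrayList<>();

    /** Chains the next update strictly after the previous one, as the watch processor does. */
    public void onUpdate(String update, boolean fail) {
        notificationFuture = notificationFuture.thenRun(() -> {
            if (fail) {
                throw new IllegalStateException("ms.invoke failed for " + update);
            }
            handled.add(update);
        });
    }

    public static void main(String[] args) {
        NotificationChainDemo demo = new NotificationChainDemo();
        demo.onUpdate("u1", false);
        demo.onUpdate("u2", true);   // completes the chain exceptionally
        demo.onUpdate("u3", false);  // thenRun is skipped: the chain is poisoned
        System.out.println(demo.handled); // only [u1]
    }
}
```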
[jira] [Commented] (IGNITE-19955) Fix create zone on restart rewrites existing data nodes because of trigger key inconsistency
[ https://issues.apache.org/jira/browse/IGNITE-19955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17796615#comment-17796615 ] Sergey Uttsel commented on IGNITE-19955: LGTM > Fix create zone on restart rewrites existing data nodes because of trigger > key inconsistency > > > Key: IGNITE-19955 > URL: https://issues.apache.org/jira/browse/IGNITE-19955 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 1h > Remaining Estimate: 0h > > Outdated, see UPD > Currently the initialisation logic for a newly created zone writes the keys > {noformat} > zoneDataNodesKey(zoneId), zoneScaleUpChangeTriggerKey(zoneId), > zoneScaleDownChangeTriggerKey(zoneId), zonesChangeTriggerKey(zoneId) > {noformat} > to the metastorage, and the condition is > {noformat} > static CompoundCondition triggerKeyConditionForZonesChanges(long > revision, int zoneId) { > return or( > notExists(zonesChangeTriggerKey(zoneId)), > > value(zonesChangeTriggerKey(zoneId)).lt(ByteUtils.longToBytes(revision)) > ); > } > {noformat} > The recovery process implies that the create zone event will be processed again, > but with a higher revision, so the data nodes will be rewritten. > We need to handle this situation so that data nodes are consistent after > restart. > A possible solution is to change the condition to > {noformat} > static SimpleCondition triggerKeyConditionForZonesCreation(long revision, > int zoneId) { > return notExists(zonesChangeTriggerKey(zoneId)); > } > static SimpleCondition triggerKeyConditionForZonesDelete(int zoneId) { > return exists(zonesChangeTriggerKey(zoneId)); > } > {noformat} > > so that we do not rely on the revision and check only the existence of the key when we > create or remove a zone. 
The problem with this solution is that reordering of the > create and remove operations on some node could lead to an inconsistent state for the zone > keys in the metastorage. > *UPD*: > This problem will be resolved once we implement > https://issues.apache.org/jira/browse/IGNITE-20561 > In this ticket we also need to unmute all tickets that were muted by this one.
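The difference between the two invoke conditions above can be sketched with a simplified in-memory metastorage (hypothetical names, not the Ignite API): with the revision-based condition, replaying the create event on restart with a higher revision rewrites the data nodes, while the existence-based condition makes the create idempotent.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch: compares the revision-based invoke condition with the
// existence-based one. Replaying a create event after restart with a higher
// revision rewrites data nodes under the revision condition (the bug), but
// is a no-op under notExists.
public class ZoneInvokeSketch {
    public final Map<String, Long> triggerKey = new HashMap<>();   // zonesChangeTriggerKey
    public final Map<String, String> dataNodes = new HashMap<>();

    /** Revision-based condition: notExists OR storedRevision < revision. */
    public boolean createZoneByRevision(String zone, long revision, String nodes) {
        Long stored = triggerKey.get(zone);
        if (stored == null || stored < revision) {
            triggerKey.put(zone, revision);
            dataNodes.put(zone, nodes);
            return true;
        }
        return false;
    }

    /** Existence-based condition: only notExists(zonesChangeTriggerKey). */
    public boolean createZoneByExistence(String zone, long revision, String nodes) {
        if (!triggerKey.containsKey(zone)) {
            triggerKey.put(zone, revision);
            dataNodes.put(zone, nodes);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ZoneInvokeSketch ms = new ZoneInvokeSketch();
        ms.createZoneByRevision("Z", 10, "A");
        // Restart: the same create event replays with a higher revision.
        ms.createZoneByRevision("Z", 20, "A,B"); // rewrites data nodes: the bug
        System.out.println(ms.dataNodes.get("Z")); // A,B

        ZoneInvokeSketch ms2 = new ZoneInvokeSketch();
        ms2.createZoneByExistence("Z", 10, "A");
        ms2.createZoneByExistence("Z", 20, "A,B"); // no-op: key already exists
        System.out.println(ms2.dataNodes.get("Z")); // A
    }
}
```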
[jira] [Commented] (IGNITE-20605) Restore scaleUp/scaleDown timers
[ https://issues.apache.org/jira/browse/IGNITE-20605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791980#comment-17791980 ] Sergey Uttsel commented on IGNITE-20605: LGTM > Restore scaleUp/scaleDown timers > > > Key: IGNITE-20605 > URL: https://issues.apache.org/jira/browse/IGNITE-20605 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > We need to restore timers that were scheduled before node restart. > h3. *Definition of done* > Timers are rescheduled after restart. > h3. *Implementation notes* > It is valid to simply schedule local timers according to the scaleUp/scaleDown > timer values from the Catalog, and as a revision take maxScUpFromMap or > maxScDownFromMap from topologyAugmentationMap, where maxScUpFromMap and > maxScDownFromMap are the max revisions in topologyAugmentationMap of the entries > associated with the addition and removal of nodes respectively.
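The implementation note above can be sketched as follows. This is a hypothetical stdlib-only sketch (the real topologyAugmentationMap holds richer entries): on restart, take the max revision among entries of the matching kind and use it as the revision for the rescheduled scale-up or scale-down timer.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: derive maxScUpFromMap / maxScDownFromMap from a
// revision-keyed topologyAugmentationMap, to use as the revision when
// rescheduling local scale-up / scale-down timers after restart.
public class TimerRestoreSketch {
    public enum Kind { ADDED, REMOVED }

    /** revision -> kind of topology change, a stand-in for topologyAugmentationMap. */
    public final ConcurrentSkipListMap<Long, Kind> topologyAugmentationMap = new ConcurrentSkipListMap<>();

    /** Max revision of an entry of the given kind, or -1 if none. */
    public long maxRevision(Kind kind) {
        return topologyAugmentationMap.entrySet().stream()
                .filter(e -> e.getValue() == kind)
                .mapToLong(e -> e.getKey())
                .max()
                .orElse(-1);
    }

    public static void main(String[] args) {
        TimerRestoreSketch s = new TimerRestoreSketch();
        s.topologyAugmentationMap.put(5L, Kind.ADDED);
        s.topologyAugmentationMap.put(7L, Kind.REMOVED);
        s.topologyAugmentationMap.put(9L, Kind.ADDED);
        System.out.println(s.maxRevision(Kind.ADDED));   // 9: revision for the scale-up timer
        System.out.println(s.maxRevision(Kind.REMOVED)); // 7: revision for the scale-down timer
    }
}
```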
[jira] [Assigned] (IGNITE-20939) Extract Distribution zones integration tests from runner module to separate one
[ https://issues.apache.org/jira/browse/IGNITE-20939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20939: -- Assignee: Sergey Uttsel > Extract Distribution zones integration tests from runner module to separate > one > --- > > Key: IGNITE-20939 > URL: https://issues.apache.org/jira/browse/IGNITE-20939 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > The runner module incorporates a large number of tests related to other modules, > which leads to a long running time for the suite on TC. > Currently, the integration tests for Distribution Zones are located in the ignite-runner > module, so we need to extract them to a separate module via the runner > test-fixtures support to decrease the execution time of tests for the runner > module. > IGNITE-20670 can be used as a reference for such activities.
[jira] [Commented] (IGNITE-20559) Return metastorage invokes in DistributionZoneManager#createMetastorageTopologyListener
[ https://issues.apache.org/jira/browse/IGNITE-20559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17785924#comment-17785924 ] Sergey Uttsel commented on IGNITE-20559: LGTM > Return metastorage invokes in > DistributionZoneManager#createMetastorageTopologyListener > --- > > Key: IGNITE-20559 > URL: https://issues.apache.org/jira/browse/IGNITE-20559 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > There are meta storage invokes in the DistributionZoneManager zone > lifecycle. The futures of these invokes are ignored, so when the lifecycle > method completes, not all of its actions are actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # LogicalTopologyEventListener to update the logical topology. > Also we need to save {{nodeAttributes}} and {{topologyAugmentationMap}} in the MS. > h3. *Definition of Done* > Need to ensure event handling linearization. All immediate data nodes > recalculations must be returned to the event handler. Also > {{nodeAttributes}} and {{topologyAugmentationMap}} must be saved in the MS, > so we can use these fields when recovering the DZM. > h3. *Implementation notes* > When a topology update is handled (createMetastorageTopologyListener), > immediately recalculate data nodes within the caller handler for all zones which > have an immediate timer. Also within the caller handler, write nodesAttributes > and topologyAugmentationMaps to the metastore. Only after completion of this ms > invoke, schedule local timers. All futures of these changes must be returned > as the result of the watch listener update, so this update can be marked as > processed only after all the above-mentioned actions are completed. 
> For CAS-ing nodesAttributes and topologyAugmentationMaps, we can reuse > DistributionZonesUtil#zonesGlobalStateRevision, but change it from a vault key > to an MS key and make it per zone. Every time we try to save these keys, we will > take the revision of the topology update (topRev) and will try to write the changes with the > condition topRev > ms.zonesGlobalStateRevision. > To clean up this map, we can remove all augmentations that are less than > min(scaleUpTriggerKeys, scaleDownTriggerKeys) when we write a new one. > Further optimization: we can save only one topologyAugmentationMap; the only > question is how to clean up the map. We can find the minimal > min(scaleUpTriggerKeys, scaleDownTriggerKeys) among all zones and clean up > everything up to this minimum.
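The cleanup rule above maps naturally onto a sorted map. This is a hypothetical sketch (the real map stores augmentation objects, not strings): when writing a new augmentation, drop all entries whose revisions are below min(scaleUpTriggerRevision, scaleDownTriggerRevision), since those have already been applied to the data nodes.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of the cleanup rule: entries strictly below
// min(scaleUpTriggerRevision, scaleDownTriggerRevision) are no longer needed
// for recovery and are dropped when a new augmentation is written.
public class AugmentationCleanupSketch {
    public final ConcurrentSkipListMap<Long, String> topologyAugmentationMap = new ConcurrentSkipListMap<>();

    public void addAugmentation(long revision, String change,
            long scaleUpTriggerRevision, long scaleDownTriggerRevision) {
        topologyAugmentationMap.put(revision, change);
        long min = Math.min(scaleUpTriggerRevision, scaleDownTriggerRevision);
        topologyAugmentationMap.headMap(min).clear(); // removes keys strictly below min
    }

    public static void main(String[] args) {
        AugmentationCleanupSketch s = new AugmentationCleanupSketch();
        s.topologyAugmentationMap.put(3L, "+A");
        s.topologyAugmentationMap.put(6L, "-B");
        s.addAugmentation(10L, "+C", 7L, 8L); // min = 7: revisions 3 and 6 are dropped
        System.out.println(s.topologyAugmentationMap.keySet()); // [10]
    }
}
```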
[jira] [Assigned] (IGNITE-16431) Entry expiration requires twice the entry size of heap
[ https://issues.apache.org/jira/browse/IGNITE-16431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-16431: -- Assignee: Sergey Uttsel > Entry expiration requires twice the entry size of heap > -- > > Key: IGNITE-16431 > URL: https://issues.apache.org/jira/browse/IGNITE-16431 > Project: Ignite > Issue Type: Improvement >Affects Versions: 2.12 >Reporter: Alexey Kukushkin >Assignee: Sergey Uttsel >Priority: Major > Labels: cggg > Attachments: 500MB-put-expiry-master.png > > Original Estimate: 64h > Remaining Estimate: 64h > > Ignite takes twice the entry size of heap to expire an entry when > {{{}eagerTtl=true{}}}. See the attached heap memory usage diagram of putting > and then expiring a 500MB entry in Ignite. > This makes Ignite inefficient at handling large objects, causing > {{OutOfMemory}} errors. > Do we really need to load the entry's value on heap at all to expire the entry? > Please enhance Ignite cache entry expiration not to load the entry's value on > heap even once, or explain why it is not possible. > !500MB-put-expiry-master.png!
[jira] [Resolved] (IGNITE-20160) NullPointerException in FSMCallerImpl.doCommitted
[ https://issues.apache.org/jira/browse/IGNITE-20160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel resolved IGNITE-20160. Resolution: Duplicate Fixed in https://issues.apache.org/jira/browse/IGNITE-20774 > NullPointerException in FSMCallerImpl.doCommitted > - > > Key: IGNITE-20160 > URL: https://issues.apache.org/jira/browse/IGNITE-20160 > Project: Ignite > Issue Type: Bug >Affects Versions: 3.0.0-beta1 >Reporter: Pavel Tupitsyn >Assignee: Sergey Uttsel >Priority: Blocker > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > Time Spent: 20m > Remaining Estimate: 0h > > {code:java} > java.lang.NullPointerException > at > org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:496) > at > org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:448) > at > org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136) > at > org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130) > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:226) > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) > {code} > [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRunnerSqlLogic/7410174?hideProblemsFromDependencies=false=false=true=true] > It happens here (see FSMCallerImpl#doCommitted): > {code:java} > final IteratorImpl iterImpl = new IteratorImpl(this.fsm, this.logManager, > closures, firstClosureIndex, > lastAppliedIndex, committedIndex, this.applyingIndex, > this.node.getOptions());{code} > on the 2nd line, most likely on resolving null pointer to *node,* which is > nullified on FSMCaller shutdown. Raft groups were being stopped in this > moment. 
> *Implementation details* > A simple fix to avoid the NPE at the aforementioned line would be to check > `node` for null. > Additionally, it would be nice to check `shutdownLatch` in `doCommitted` and > make sure we call `unsubscribe` in the proper order, > because doCommitted is called from a disruptor callback. > One more observation: the reference to the node is set to null in `shutdown`, > which is called before `join`, where we unsubscribe from Disruptor > notifications. There is a small chance that something comes into FSMCallerImpl > after shutdown but before join. -- This message was sent by Atlassian Jira (v8.20.10#820010)
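The guard described in the implementation details above can be sketched as follows. This is a hedged, simplified stand-in for the real org.apache.ignite.raft.jraft.core.FSMCallerImpl: the class name FsmCallerSketch and both fields are hypothetical, and the sketch only shows the shape of the null/shutdown check, not jraft's actual apply path.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified model of the proposed fix -- not the real FSMCallerImpl.
class FsmCallerSketch {
    /** Stands in for the node reference that is nulled out in shutdown(). */
    private final AtomicReference<Object> node = new AtomicReference<>(new Object());

    /** Non-null while a shutdown is in progress, mirroring `shutdownLatch`. */
    private volatile CountDownLatch shutdownLatch;

    /** Returns true if the committed batch was applied, false if it was skipped. */
    boolean doCommitted(long committedIndex) {
        // The fix: bail out instead of dereferencing a node that was nulled on shutdown.
        if (shutdownLatch != null || node.get() == null) {
            return false; // shut down concurrently; skip the apply task
        }
        // ... build the iterator and apply entries using the node's options ...
        return true;
    }

    void shutdown() {
        shutdownLatch = new CountDownLatch(1);
        node.set(null); // this nulling is what made the original code throw the NPE
    }
}
```

Because the disruptor callback can still fire between `shutdown` and `join`, the guard turns a late apply task into a no-op instead of an NPE.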
[jira] [Commented] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774956#comment-17774956 ] Sergey Uttsel commented on IGNITE-20317: Now LGTM > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method is completed, not all of its actions are actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # DistributionZoneManager#onUpdateScaleUp > # DistributionZoneManager#onUpdateScaleDown > -DistributionZoneRebalanceEngine#onUpdateReplicas to update assignments on > replicas update.- > -LogicalTopologyEventListener to update logical topology.- > -DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener > watch listener to update pending assignments.- > h3. *Definition of Done* > Need to ensure event handling linearization. All immediate data nodes > recalculation must be returned to the event handler. > h3. *Implementation Notes* > * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, > DistributionZoneManager#onUpdateFilter and > DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration > listeners. 
So we can just return the ms invoke future from these methods > and it ensures that this invoke will be completed within the current event > handling. > * We cannot return a future from LogicalTopologyEventListener's methods. We > can ignore these futures. It has a drawback: we can skip a topology update > # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 > # Node C joined the topology and left quickly, and the ms invokes to update the > topology entry were reordered. > # data nodes were not updated immediately to [A,B,C]. > We think that we can ignore this bug because eventually it doesn't break the > consistency of the data nodes. For this purpose we need to change the invoke > condition: > `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` > instead of > `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() > - 1))` > * Need to return ms invoke futures from the WatchListener#onUpdate method of the > data nodes listener. -- This message was sent by Atlassian Jira (v8.20.10#820010)
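The proposed condition change (`lt` on the topology version instead of `eq(version - 1)`) can be illustrated with a small, hypothetical model. TopologyVersionGuard is not the meta storage API; `storedVersion` merely stands in for the zonesLogicalTopologyVersionKey entry. The sketch shows why the `eq` guard wedges forever once an intermediate update is lost or reordered, while `lt` still lets the newest topology version through.

```java
// Hypothetical, simplified model of the invoke-condition change described above.
class TopologyVersionGuard {
    private long storedVersion = 0;

    /** Old condition: eq(newVersion - 1). Rejects everything once a version is skipped. */
    boolean invokeWithEq(long newVersion) {
        if (storedVersion == newVersion - 1) {
            storedVersion = newVersion;
            return true;
        }
        return false;
    }

    /** New condition: lt(newVersion). Tolerates reordered or lost intermediate updates. */
    boolean invokeWithLt(long newVersion) {
        if (storedVersion < newVersion) {
            storedVersion = newVersion;
            return true;
        }
        return false;
    }
}
```

With `lt`, a stale retry is simply a no-op, so the key converges to the latest topology version even when intermediate invokes are dropped.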
[jira] [Comment Edited] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774878#comment-17774878 ] Sergey Uttsel edited comment on IGNITE-20317 at 10/13/23 11:45 AM: --- Some action points of the ticket were not implemented: {code:java} LogicalTopologyEventListener to update logical topology. we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` {code} {code:java} DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments.{code} was (Author: sergey uttsel): Some action points of the ticket were not implemented: # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method is completed, not all of its actions are actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. 
> # DistributionZoneManager#onUpdateScaleUp > # DistributionZoneManager#onUpdateScaleDown > -DistributionZoneRebalanceEngine#onUpdateReplicas to update assignments on > replicas update.- > -LogicalTopologyEventListener to update logical topology.- > -DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener > watch listener to update pending assignments.- > h3. *Definition of Done* > Need to ensure event handling linearization. All immediate data nodes > recalculation must be returned to the event handler. > h3. *Implementation Notes* > * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, > DistributionZoneManager#onUpdateFilter and > DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration > listeners. So we can just return the ms invoke future from these methods > and it ensures that this invoke will be completed within the current event > handling. > * We cannot return a future from LogicalTopologyEventListener's methods. We > can ignore these futures. It has a drawback: we can skip a topology update > # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 > # Node C joined the topology and left quickly, and the ms invokes to update the > topology entry were reordered. > # data nodes were not updated immediately to [A,B,C]. > We think that we can ignore this bug because eventually it doesn't break the > consistency of the data nodes. For this purpose we need to change the invoke > condition: > `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` > instead of > `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() > - 1))` > * Need to return ms invoke futures from the WatchListener#onUpdate method of the > data nodes listener. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774878#comment-17774878 ] Sergey Uttsel edited comment on IGNITE-20317 at 10/13/23 11:45 AM: --- Some action points of the ticket were not implemented: {code:java} LogicalTopologyEventListener to update logical topology. we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` {code} {code:java} DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments.{code} was (Author: sergey uttsel): Some action points of the ticket were not implemented: {code:java} LogicalTopologyEventListener to update logical topology. we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` {code} {code:java} DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments.{code} > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method is completed, not all of its actions are actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. 
> # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # DistributionZoneManager#onUpdateScaleUp > # DistributionZoneManager#onUpdateScaleDown > -DistributionZoneRebalanceEngine#onUpdateReplicas to update assignments on > replicas update.- > -LogicalTopologyEventListener to update logical topology.- > -DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener > watch listener to update pending assignments.- > h3. *Definition of Done* > Need to ensure event handling linearization. All immediate data nodes > recalculation must be returned to the event handler. > h3. *Implementation Notes* > * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, > DistributionZoneManager#onUpdateFilter and > DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration > listeners. So we can just return the ms invoke future from these methods > and it ensures that this invoke will be completed within the current event > handling. > * We cannot return a future from LogicalTopologyEventListener's methods. We > can ignore these futures. It has a drawback: we can skip a topology update > # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 > # Node C joined the topology and left quickly, and the ms invokes to update the > topology entry were reordered. > # data nodes were not updated immediately to [A,B,C]. > We think that we can ignore this bug because eventually it doesn't break the > consistency of the data nodes. For this purpose we need to change the invoke > condition: > `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` > instead of > `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() > - 1))` > * Need to return ms invoke futures from the WatchListener#onUpdate method of the > data nodes listener. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17774878#comment-17774878 ] Sergey Uttsel commented on IGNITE-20317: Some action points of the ticket were not implemented: # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Mirza Aliev >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method is completed, not all of its actions are actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # DistributionZoneManager#onUpdateScaleUp > # DistributionZoneManager#onUpdateScaleDown > -DistributionZoneRebalanceEngine#onUpdateReplicas to update assignments on > replicas update.- > -LogicalTopologyEventListener to update logical topology.- > -DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener > watch listener to update pending assignments.- > h3. *Definition of Done* > Need to ensure event handling linearization. All immediate data nodes > recalculation must be returned to the event handler. > h3. 
*Implementation Notes* > * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, > DistributionZoneManager#onUpdateFilter and > DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration > listeners. So we can just return the ms invoke future from these methods > and it ensures that this invoke will be completed within the current event > handling. > * We cannot return a future from LogicalTopologyEventListener's methods. We > can ignore these futures. It has a drawback: we can skip a topology update > # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 > # Node C joined the topology and left quickly, and the ms invokes to update the > topology entry were reordered. > # data nodes were not updated immediately to [A,B,C]. > We think that we can ignore this bug because eventually it doesn't break the > consistency of the data nodes. For this purpose we need to change the invoke > condition: > `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` > instead of > `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() > - 1))` > -* Need to return ms invoke futures from the WatchListener#onUpdate method of the > data nodes listener.- -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-20599) Implement a 'not' operation in the meta storage dsl.
Sergey Uttsel created IGNITE-20599: -- Summary: Implement a 'not' operation in the meta storage dsl. Key: IGNITE-20599 URL: https://issues.apache.org/jira/browse/IGNITE-20599 Project: Ignite Issue Type: Improvement Reporter: Sergey Uttsel *Motivation* In https://issues.apache.org/jira/browse/IGNITE-20561 we need to create a condition for a ms invoke with negation. We could do this in two ways: {code:java} and( notExists(dataNodes(zoneId)), notTombstone(dataNodes(zoneId)) ){code} or {code:java} not( or( exists(dataNodes(zoneId)), tombstone(dataNodes(zoneId)) ) ){code} But there are no `notTombstone` or `not` methods in the meta storage dsl. I propose to implement the `not` operation because it is a more general approach and can be reused with other conditions. *Definition of done* The `not` operation is implemented. -- This message was sent by Atlassian Jira (v8.20.10#820010)
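As a rough illustration of why a generic `not` composes better than one-off negated conditions, here is a sketch that models conditions as plain predicates. This is not the real meta storage DSL (its Condition type is not a java.util.function.Predicate, and the Conditions class here is hypothetical); the sketch only demonstrates the De Morgan equivalence between not(or(exists, tombstone)) and and(notExists, notTombstone).

```java
import java.util.function.Predicate;

// Hypothetical sketch: conditions modeled as predicates over the storage state.
final class Conditions {
    private Conditions() {
    }

    /** A generic `not` combinator: reusable with any condition, unlike ad-hoc notXxx methods. */
    static <S> Predicate<S> not(Predicate<S> inner) {
        return inner.negate();
    }

    static <S> Predicate<S> or(Predicate<S> a, Predicate<S> b) {
        return a.or(b);
    }
}
```

Under this model, `not(or(exists, tombstone))` accepts exactly the states that `and(notExists, notTombstone)` would, which is the generality argument made in the ticket.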
[jira] [Updated] (IGNITE-20561) Change condition for DistributionZonesUtil#triggerKeyConditionForZonesChanges to use ConditionType#TOMBSTONE
[ https://issues.apache.org/jira/browse/IGNITE-20561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20561: --- Description: *Motivation* Currently we use zonesChangeTriggerKey in DistributionZonesUtil#triggerKeyConditionForZonesChanges to create a condition to initialize the zone's meta storage keys on a zone creation and to remove these keys on a zone drop. It causes some issues: # we cannot remove zonesChangeTriggerKey to ensure that the zone will not be recreated on DZM restart # it doesn't work properly now because it is possible that on DZM restart the zone will be recreated with a revision which is higher than the original zone creation revision. *Implementation notes* To fix it we need to get rid of zonesChangeTriggerKey and use a dataNodes ms key on a zone create and a zone drop. So the condition for a zone creation will be: {code:java} and( notExists(dataNodes(zoneId)), notTombstone(dataNodes(zoneId)) ){code} and for a zone drop: {code:java} exists(dataNodes(zoneId)){code} In this case incorrect filter will not throw PathNotFoundException and returns empty 'res'. *Definition of done* Got rid of the meta storage zonesChangeTriggerKey key. was: *Motivation* Currently we use zonesChangeTriggerKey in DistributionZonesUtil#triggerKeyConditionForZonesChanges to create a condition to initialize the zone's meta storage keys on a zone creation and to remove these keys on a zone drop. It causes some issues: # we cannot remove zonesChangeTriggerKey to ensure that the zone will not be recreated on DZM restart # it doesn't work properly now because it is possible that on DZM restart the zone will be recreated with a revision which is higher than the original zone creation revision. *Implementation notes* To fix it we need to get rid of zonesChangeTriggerKey and use a dataNodes ms key on a zone create and a zone drop. 
So the condition for a zone creation will be: and( notExists(dataNodes(zoneId)), notTombstone(dataNodes(zoneId)) ) and for a zone drop: exists(dataNodes(zoneId)) *Definition of done* Got rid of the meta storage zonesChangeTriggerKey key. > Change condition for DistributionZonesUtil#triggerKeyConditionForZonesChanges > to use ConditionType#TOMBSTONE > - > > Key: IGNITE-20561 > URL: https://issues.apache.org/jira/browse/IGNITE-20561 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Priority: Major > Labels: ignite-3 > > *Motivation* > Currently we use zonesChangeTriggerKey in > DistributionZonesUtil#triggerKeyConditionForZonesChanges to create a condition > to initialize the zone's meta storage keys on a zone creation and to remove > these keys on a zone drop. It causes some issues: > # we cannot remove zonesChangeTriggerKey to ensure that the zone will not be > recreated on DZM restart > # it doesn't work properly now because it is possible that on DZM restart the > zone will be recreated with a revision which is higher than the original > zone creation revision. > *Implementation notes* > To fix it we need to get rid of zonesChangeTriggerKey and use a dataNodes ms > key on a zone create and a zone drop. > So the condition for a zone creation will be: > {code:java} > and( > notExists(dataNodes(zoneId)), > notTombstone(dataNodes(zoneId)) > ){code} > and for a zone drop: > {code:java} > exists(dataNodes(zoneId)){code} > > *Definition of done* > Got rid of the meta storage zonesChangeTriggerKey key. -- This message was sent by Atlassian Jira (v8.20.10#820010)
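The two conditions above can be modeled with a simplified sketch. The class name ZoneKeyConditions and the KeyState enum are hypothetical: in the real meta storage a tombstone is a deleted-but-retained entry, which is reduced here to a plain enum value for the dataNodes key, just to show which states each condition accepts.

```java
// Hypothetical model of the zone create/drop conditions -- not the real meta storage API.
class ZoneKeyConditions {
    enum KeyState { ABSENT, PRESENT, TOMBSTONE }

    /** Zone creation: and(notExists(dataNodes), notTombstone(dataNodes)). */
    static boolean canInitZoneKeys(KeyState dataNodes) {
        return dataNodes != KeyState.PRESENT && dataNodes != KeyState.TOMBSTONE;
    }

    /** Zone drop: exists(dataNodes). */
    static boolean canRemoveZoneKeys(KeyState dataNodes) {
        return dataNodes == KeyState.PRESENT;
    }
}
```

The tombstone check is what prevents a restarted DZM from re-initializing keys for a zone that was already dropped, which the plain notExists condition could not express.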
[jira] [Updated] (IGNITE-20561) Change condition for DistributionZonesUtil#triggerKeyConditionForZonesChanges to use ConditionType#TOMBSTONE
[ https://issues.apache.org/jira/browse/IGNITE-20561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20561: --- Description: *Motivation* Currently we use zonesChangeTriggerKey in DistributionZonesUtil#triggerKeyConditionForZonesChanges to create a condition to initialize the zone's meta storage keys on a zone creation and to remove these keys on a zone drop. It causes some issues: # we cannot remove zonesChangeTriggerKey to ensure that the zone will not be recreated on DZM restart # it doesn't work properly now because it is possible that on DZM restart the zone will be recreated with a revision which is higher than the original zone creation revision. *Implementation notes* To fix it we need to get rid of zonesChangeTriggerKey and use a dataNodes ms key on a zone create and a zone drop. So the condition for a zone creation will be: and( notExists(dataNodes(zoneId)), notTombstone(dataNodes(zoneId)) ) and for a zone drop: exists(dataNodes(zoneId)) *Definition of done* Got rid of the meta storage zonesChangeTriggerKey key. was:We need to use {{ConditionType#TOMBSTONE}} in {{DistributionZonesUtil#triggerKeyConditionForZonesChanges}} when we initialise keys for zones in MS > Change condition for DistributionZonesUtil#triggerKeyConditionForZonesChanges > to use ConditionType#TOMBSTONE > - > > Key: IGNITE-20561 > URL: https://issues.apache.org/jira/browse/IGNITE-20561 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Priority: Major > Labels: ignite-3 > > *Motivation* > Currently we use zonesChangeTriggerKey in > DistributionZonesUtil#triggerKeyConditionForZonesChanges to create a condition > to initialize the zone's meta storage keys on a zone creation and to remove > these keys on a zone drop. 
It causes some issues: # we cannot remove > zonesChangeTriggerKey to ensure that the zone will not be recreated on DZM > restart > # it doesn't work properly now because it is possible that on DZM restart the > zone will be recreated with a revision which is higher than the original > zone creation revision. > > *Implementation notes* > To fix it we need to get rid of zonesChangeTriggerKey and use a dataNodes ms > key on a zone create and a zone drop. > So the condition for a zone creation will be: > and( > notExists(dataNodes(zoneId)), > notTombstone(dataNodes(zoneId)) > ) > and for a zone drop: > exists(dataNodes(zoneId)) > > *Definition of done* > Got rid of the meta storage zonesChangeTriggerKey key. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20536) No-op handlers for StripedDisruptor.StripeEntryHandler#subscribers
[ https://issues.apache.org/jira/browse/IGNITE-20536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20536: --- Description: h3. Motivation In https://issues.apache.org/jira/browse/IGNITE-20397 we discussed that it is possible to get a null handler in StripedDisruptor.StripeEntryHandler#onEvent on a table drop. And we started to use a log warning instead of an assert. But this is not the best solution. We still need to assert that the handler is not null on the first event for the partition. And we need to skip events if the partition was removed. So we need: # to add an assert that `handler != null`, # on StripedDisruptor.StripeEntryHandler#unsubscribe put a no-op handler into the subscribers map instead of removing it, # to remove the no-op handler when there are no events for this handler. h3. Definition of done: # assert that `handler != null` is added, # no-op handler on StripedDisruptor.StripeEntryHandler#unsubscribe, # remove the handler when it is not needed was: h3. Motivation In https://issues.apache.org/jira/browse/IGNITE-20397 we discussed that it is possible to get a null handler in StripedDisruptor.StripeEntryHandler#onEvent on a table drop. And we started to use a log warning instead of an assert. But this is not the best solution. We still need to assert that the handler is not null on the first event for the partition. And we need to skip events if the partition was removed. So we need: # to add an assert that `handler != null`, # on StripedDisruptor.StripeEntryHandler#unsubscribe put a no-op handler into the subscribers map instead of removing it, # to remove the no-op handler when there are no events for this handler. h3. Definition of done: # assert that `handler != null` is added, # no-op handler on StripedDisruptor.StripeEntryHandler#unsubscribe, # remove the handler when it is not needed. 
> No-op handlers for StripedDisruptor.StripeEntryHandler#subscribers > -- > > Key: IGNITE-20536 > URL: https://issues.apache.org/jira/browse/IGNITE-20536 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. Motivation > In https://issues.apache.org/jira/browse/IGNITE-20397 we discussed that it is > possible to get a null handler in StripedDisruptor.StripeEntryHandler#onEvent > on a table drop. And we started to use a log warning instead of an assert. > But this is not the best solution. We still need to assert that the handler is > not null on the first event for the partition. And we need to skip events if the > partition was removed. So we need: > # to add an assert that `handler != null`, > # on StripedDisruptor.StripeEntryHandler#unsubscribe put a no-op handler into > the subscribers map instead of removing it, > # to remove the no-op handler when there are no events for this handler. > h3. Definition of done: > # assert that `handler != null` is added, > # no-op handler on StripedDisruptor.StripeEntryHandler#unsubscribe, > # remove the handler when it is not needed -- This message was sent by Atlassian Jira (v8.20.10#820010)
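The no-op-handler idea from the list above can be sketched as below. This is a hedged simplification of StripedDisruptor.StripeEntryHandler: real events are jraft entries routed by node id, reduced here to strings keyed by a group id, and the class name StripeHandlerSketch is hypothetical. Only the subscribe/unsubscribe/onEvent shape is shown.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch of "park a no-op handler on unsubscribe instead of removing it".
class StripeHandlerSketch {
    /** Shared no-op handler that silently drops late events for removed groups. */
    private static final Consumer<String> NO_OP = event -> { };

    private final Map<String, Consumer<String>> subscribers = new ConcurrentHashMap<>();

    void subscribe(String groupId, Consumer<String> handler) {
        subscribers.put(groupId, handler);
    }

    /** Instead of removing the entry, park the no-op handler so late events are skipped. */
    void unsubscribe(String groupId) {
        subscribers.put(groupId, NO_OP);
    }

    /** With the no-op parked, the `handler != null` assertion can be kept. */
    void onEvent(String groupId, String event) {
        Consumer<String> handler = subscribers.get(groupId);
        assert handler != null : "Group of the event is unsupported: " + groupId;
        if (handler != null) {
            handler.accept(event);
        }
    }

    boolean isParked(String groupId) {
        return subscribers.get(groupId) == NO_OP;
    }
}
```

The remaining piece from the ticket, removing the parked no-op once no more events can arrive for the group, is deliberately left out here because it depends on the disruptor's sequencing guarantees.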
[jira] [Updated] (IGNITE-20536) No-op handlers for StripedDisruptor.StripeEntryHandler#subscribers
[ https://issues.apache.org/jira/browse/IGNITE-20536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20536: --- Ignite Flags: (was: Docs Required,Release Notes Required) > No-op handlers for StripedDisruptor.StripeEntryHandler#subscribers > -- > > Key: IGNITE-20536 > URL: https://issues.apache.org/jira/browse/IGNITE-20536 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. Motivation > In https://issues.apache.org/jira/browse/IGNITE-20397 we discussed that it is > possible to get a null handler in StripedDisruptor.StripeEntryHandler#onEvent > on a table drop. And we started to use a log warning instead of an assert. > But this is not the best solution. We still need to assert that the handler is > not null on the first event for the partition. And we need to skip events if the > partition was removed. So we need: > # to add an assert that `handler != null`, > # on StripedDisruptor.StripeEntryHandler#unsubscribe put a no-op handler into > the subscribers map instead of removing it, > # to remove the no-op handler when there are no events for this handler. > h3. Definition of done: > # assert that `handler != null` is added, > # no-op handler on StripedDisruptor.StripeEntryHandler#unsubscribe, > # remove the handler when it is not needed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported
[ https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20397: -- Assignee: Sergey Uttsel > java.lang.AssertionError: Group of the event is unsupported > --- > > Key: IGNITE-20397 > URL: https://issues.apache.org/jira/browse/IGNITE-20397 > Project: Ignite > Issue Type: Bug >Reporter: Alexander Lapin >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > h3. Motivation > {code:java} > java.lang.AssertionError: Group of the event is unsupported > [nodeId=<11_part_18/isaat_n_2>, > event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) > ~[disruptor-3.3.7.jar:?] > at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} > [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true] > The root cause: > # The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from > StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId(). > # In some cases the `subscribers` map is cleared by an invocation of > StripedDisruptor.StripeEntryHandler#unsubscribe (for example on table > dropping), and then StripeEntryHandler receives an event with > SafeTimeSyncCommandImpl. > # It produces an assertion error: `assert handler != null` > The issue is not caused by the catalog feature changes. > The issue is reproduced when I run > ItSqlAsynchronousApiTest#batchIncomplete with the RepeatedTest annotation. 
In > this case the cluster is not restarted after each test. It is possible to > reproduce it frequently by adding a Thread.sleep in StripeEntryHandler#onEvent. > h3. Implementation notes > We decided that we can use LOG.warn() instead of an assert because it is > safe to skip this event if the table was dropped. > {code:java} > if (handler != null) { > handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && > !supportsBatches); > } else { > LOG.warn(format("Group of the event is unsupported [nodeId={}, > event={}]", event.nodeId(), event)); > } {code} > It is a temporary solution and we need to add a TODO with the link > https://issues.apache.org/jira/browse/IGNITE-20536 > *Definition of done* > There are no asserts if the handler is null. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported
[ https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20397: --- Description: h3. Motivation {code:java} java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true] The root cause: # The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId(). # In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe (for example on table dropping), and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl. # It produces an assertion error: `assert handler != null` The issue is not caused by the catalog feature changes. The issue is reproduced when I run ItSqlAsynchronousApiTest#batchIncomplete with the RepeatedTest annotation. In this case the cluster is not restarted after each test. It is possible to reproduce it frequently by adding a Thread.sleep in StripeEntryHandler#onEvent. h3. Implementation notes We decided that we can use LOG.warn() instead of an assert because it is safe to skip this event if the table was dropped. 
{code:java} if (handler != null) { handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && !supportsBatches); } else { LOG.warn(format("Group of the event is unsupported [nodeId={}, event={}]", event.nodeId(), event)); } {code} It is a temporary solution and we need to add a TODO with the link https://issues.apache.org/jira/browse/IGNITE-20536 *Definition of done* There are no asserts if the handler is null. was: h3. Motivation {code:java} java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true] The root cause: # The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId(). # In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe (for example on table dropping), and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl. # It produces an assertion error: `assert handler != null` The issue is not caused by the catalog feature changes. The issue is reproduced when I run ItSqlAsynchronousApiTest#batchIncomplete with the RepeatedTest annotation. In this case the cluster is not restarted after each test. It is possible to reproduce it frequently by adding a Thread.sleep in StripeEntryHandler#onEvent. h3. 
Implementation notes We decided that we can use LOG.warn() instead of an assert because it is safely to skip this event if the table was dropped. {code:java} if (handler != null) { handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && !supportsBatches); } else { LOG.warn(format("Group of the event is unsupported [nodeId={}, event={}]", event.nodeId(), event)); } {code} *Definition of done* There is no asserts if handler is null. > java.lang.AssertionError: Group of the event is unsupported > --- > > Key: IGNITE-20397 > URL: https://issues.apache.org/jira/browse/IGNITE-20397 > Project: Ignite > Issue Type: Bug >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > h3. Motivation > {code:java} > java.lang.AssertionError: Group of the event is unsupported >
[jira] [Created] (IGNITE-20536) No-op handlers for StripedDisruptor.StripeEntryHandler#subscribers
Sergey Uttsel created IGNITE-20536:
--
Summary: No-op handlers for StripedDisruptor.StripeEntryHandler#subscribers
Key: IGNITE-20536
URL: https://issues.apache.org/jira/browse/IGNITE-20536
Project: Ignite
Issue Type: Bug
Reporter: Sergey Uttsel
h3. Motivation
In https://issues.apache.org/jira/browse/IGNITE-20397 we discussed that it is possible to get a null handler in StripedDisruptor.StripeEntryHandler#onEvent on a table drop, and we started using a log warning instead of an assert. But this is not the best solution: we still need to assert that the handler is not null on the first event for a partition, and we need to skip events if the partition was removed. So we need:
# to add the assert that `handler != null`,
# in StripedDisruptor.StripeEntryHandler#unsubscribe, to put a no-op handler into the subscribers map instead of removing the entry,
# to remove the no-op handler when there are no more events for it.
h3. Definition of done:
# the assert that `handler != null` is added,
# a no-op handler is installed on StripedDisruptor.StripeEntryHandler#unsubscribe,
# the handler is removed when it is no longer needed.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
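The steps above can be sketched as follows. This is only an illustrative model, not the actual StripedDisruptor API: the class and method names below are hypothetical, and the real handler type is Disruptor's EventHandler rather than a BiConsumer. The idea is that unsubscribe installs a no-op tombstone, so late events for a removed partition are silently skipped while a truly unknown group id still trips the assert.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical sketch of the ticket's no-op handler idea: on unsubscribe,
// replace the handler with NOOP instead of removing it from the map.
class SubscriberSketch {
    static final BiConsumer<String, Object> NOOP = (id, event) -> { };

    final Map<String, BiConsumer<String, Object>> subscribers = new ConcurrentHashMap<>();

    void subscribe(String nodeId, BiConsumer<String, Object> handler) {
        subscribers.put(nodeId, handler);
    }

    void unsubscribe(String nodeId) {
        // Keep a tombstone entry instead of removing, so the assert below stays valid
        // even if a SafeTimeSyncCommand arrives after the partition was dropped.
        subscribers.replace(nodeId, NOOP);
    }

    void onEvent(String nodeId, Object event) {
        BiConsumer<String, Object> handler = subscribers.get(nodeId);
        // The assert is restored: only a genuinely unknown group id fails here.
        assert handler != null : "Group of the event is unsupported [nodeId=" + nodeId + "]";
        handler.accept(nodeId, event);
    }
}
```

Step 3 (removing the tombstone once its stripe has drained) is the part that still needs a real design in the disruptor's lifecycle and is not modeled here.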
[jira] [Updated] (IGNITE-20448) Implement strategies for failure handling
[ https://issues.apache.org/jira/browse/IGNITE-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20448: --- Reviewer: Vyacheslav Koptilin > Implement strategies for failure handling > - > > Key: IGNITE-20448 > URL: https://issues.apache.org/jira/browse/IGNITE-20448 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > Need to implement the following strategies for failure handling: > - StopNodeFailureHandler This handler should stop the node in case of a > critical error > - StopNodeOrHaltFailureHandler This handler should try to stop the node. If > the node cannot be stopped during a timeout, then the JVM process should be > stopped forcibly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
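The two strategies quoted above can be sketched as one helper; this is a minimal model rather than Ignite's actual handler classes, and the halt action is injectable here (the real StopNodeOrHaltFailureHandler would call Runtime.getRuntime().halt(...)) so the timeout path can be exercised without killing the JVM.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch of the StopNodeOrHaltFailureHandler strategy: try to stop the node
// gracefully; if the stop does not finish within the timeout, fall back to
// forcibly halting the process.
class FailureHandlersSketch {
    /** Runs nodeStop; returns true if it finished in time, otherwise runs haltAction and returns false. */
    static boolean stopNodeOrHalt(Runnable nodeStop, long timeoutMs, Runnable haltAction) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<?> stop = executor.submit(nodeStop);
            stop.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException | InterruptedException | ExecutionException e) {
            haltAction.run(); // in the real handler: Runtime.getRuntime().halt(exitCode)
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

StopNodeFailureHandler is then just the same call with an effectively unbounded timeout and no halt fallback.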
[jira] [Assigned] (IGNITE-20448) Implement strategies for failure handling
[ https://issues.apache.org/jira/browse/IGNITE-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20448: -- Assignee: Sergey Uttsel (was: Vyacheslav Koptilin) > Implement strategies for failure handling > - > > Key: IGNITE-20448 > URL: https://issues.apache.org/jira/browse/IGNITE-20448 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > Need to implement the following strategies for failure handling: > - StopNodeFailureHandler This handler should stop the node in case of a > critical error > - StopNodeOrHaltFailureHandler This handler should try to stop the node. If > the node cannot be stopped during a timeout, then the JVM process should be > stopped forcibly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20448) Implement strategies for failure handling
[ https://issues.apache.org/jira/browse/IGNITE-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20448: -- Assignee: Vyacheslav Koptilin (was: Sergey Uttsel) > Implement strategies for failure handling > - > > Key: IGNITE-20448 > URL: https://issues.apache.org/jira/browse/IGNITE-20448 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Vyacheslav Koptilin >Priority: Major > Labels: ignite-3 > Time Spent: 20m > Remaining Estimate: 0h > > Need to implement the following strategies for failure handling: > - StopNodeFailureHandler This handler should stop the node in case of a > critical error > - StopNodeOrHaltFailureHandler This handler should try to stop the node. If > the node cannot be stopped during a timeout, then the JVM process should be > stopped forcibly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported
[ https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20397:
---
Description:
h3. Motivation
{code:java}
java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
The root cause:
# The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
# In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe (for example, on table dropping), and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl.
# This produces an assertion error: `assert handler != null`.
The issue is not caused by the catalog feature changes. It is reproduced when ItSqlAsynchronousApiTest#batchIncomplete is run with the @RepeatedTest annotation; in this case the cluster is not restarted after each test. It can be reproduced frequently by adding a Thread.sleep in StripeEntryHandler#onEvent.
h3. Implementation notes
We decided that we can use LOG.warn() instead of the assert, because it is safe to skip the event if the table was dropped.
{code:java}
if (handler != null) {
    handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && !supportsBatches);
} else {
    LOG.warn(format("Group of the event is unsupported [nodeId={}, event={}]", event.nodeId(), event));
}
{code}
*Definition of done*
There are no asserts if the handler is null.
was:
h3. Motivation
{code:java}
java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
The root cause:
# The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
# In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe (for example, on table dropping), and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl.
# This produces an assertion error: `assert handler != null`.
The issue is not caused by the catalog feature changes. It is reproduced when ItSqlAsynchronousApiTest#batchIncomplete is run with the @RepeatedTest annotation; in this case the cluster is not restarted after each test. It can be reproduced frequently by adding a Thread.sleep in StripeEntryHandler#onEvent.
We decided that we can use LOG.warn() instead of an assert:
{code:java}
if (handler != null) {
    handler.onEvent(event, sequence, endOfBatch || subscribers.size() > 1 && !supportsBatches);
} else {
    LOG.warn(format("Group of the event is unsupported [nodeId={}, event={}]", event.nodeId(), event));
}
{code}
> java.lang.AssertionError: Group of the event is unsupported
> ---
>
> Key: IGNITE-20397
> URL: https://issues.apache.org/jira/browse/IGNITE-20397
> Project: Ignite
> Issue Type: Bug
> Reporter: Alexander Lapin
> Priority: Major
> Labels: ignite-3
>
> h3. Motivation
> {code:java}
> java.lang.AssertionError: Group of the event is unsupported
> [nodeId=<11_part_18/isaat_n_2>,
> event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
> at
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224)
> ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
> at
>
[jira] [Assigned] (IGNITE-20448) Implement strategies for failure handling
[ https://issues.apache.org/jira/browse/IGNITE-20448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20448: -- Assignee: Sergey Uttsel > Implement strategies for failure handling > - > > Key: IGNITE-20448 > URL: https://issues.apache.org/jira/browse/IGNITE-20448 > Project: Ignite > Issue Type: Improvement >Reporter: Vyacheslav Koptilin >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > Need to implement the following strategies for failure handling: > - StopNodeFailureHandler This handler should stop the node in case of a > critical error > - StopNodeOrHaltFailureHandler This handler should try to stop the node. If > the node cannot be stopped during a timeout, then the JVM process should be > stopped forcibly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20447) Introduce a new failure handling component
[ https://issues.apache.org/jira/browse/IGNITE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20447:
---
Reviewer: Mirza Aliev
> Introduce a new failure handling component
> --
>
> Key: IGNITE-20447
> URL: https://issues.apache.org/jira/browse/IGNITE-20447
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vyacheslav Koptilin
> Assignee: Sergey Uttsel
> Priority: Major
> Labels: ignite-3
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Let's add a new component `failure` to Apache Ignite 3 and add base interfaces to this component.
> *Definition of done:*
> - introduced a new module to the Ignite 3 codebase
> - introduced a new Ignite component - _FailureProcessor_ - with a minimal no-op implementation. This component is responsible for processing critical errors.
> - introduced a new _FailureHandler_ interface. An implementation of this interface represents a concrete strategy for handling errors.
> - introduced a new enum _FailureType_ that describes the possible types of failure. The following types can be considered as a starting point: _CRITICAL_ERROR_, _SYSTEM_WORKER_TERMINATION_, _SYSTEM_WORKER_BLOCKED_, _SYSTEM_CRITICAL_OPERATION_TIMEOUT_
> - introduced a new class _FailureContext_ that contains information about the failure type and the exception.
> *Implementation notes:*
> All these classes and interfaces should be part of the internal API, because the end user should not provide a custom implementation of the failure handler; Apache Ignite should provide a closed list of handlers out of the box.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
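The base interfaces from the definition of done above can be sketched as follows. The real classes live in Ignite 3's internal API, so treat these shapes as illustrative assumptions rather than the actual signatures; in particular, the boolean return convention of onFailure is a guess here.

```java
// Sketch of the failure handling component's base types.
enum FailureType {
    CRITICAL_ERROR,
    SYSTEM_WORKER_TERMINATION,
    SYSTEM_WORKER_BLOCKED,
    SYSTEM_CRITICAL_OPERATION_TIMEOUT
}

// Carries the failure type and the exception that caused it.
class FailureContext {
    final FailureType type;
    final Throwable error;

    FailureContext(FailureType type, Throwable error) {
        this.type = type;
        this.error = error;
    }
}

// A concrete strategy for handling critical errors.
interface FailureHandler {
    /** Returns true if the node should be stopped as a result of this failure. */
    boolean onFailure(FailureContext ctx);
}

// Minimal no-op implementation of the processor: delegates to a handler
// that never asks to stop the node.
class NoOpFailureProcessor {
    private final FailureHandler handler = ctx -> false;

    boolean process(FailureContext ctx) {
        return handler.onFailure(ctx);
    }
}
```

The strategies from IGNITE-20448 (StopNodeFailureHandler, StopNodeOrHaltFailureHandler) would then be further implementations of the same FailureHandler interface.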
[jira] [Updated] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported
[ https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20397:
---
Description:
{code:java}
java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
The root cause:
# The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
# In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl.
# This produces an assertion error: `assert handler != null`.
The issue is not caused by the catalog feature changes. It can be reproduced by adding a Thread.sleep in StripeEntryHandler#onEvent.
UPD: The issue is reproduced when ItSqlAsynchronousApiTest#batchIncomplete is run with the @RepeatedTest annotation; in this case the cluster is not restarted after each test. When the test class is changed to "start cluster, create table, drop table, stop cluster", the issue is not reproduced.
We decided that we can use LOG.warn() instead of an assert:
* if handler == null, then log a warning instead of failing the assert
* else call handler.onEvent
was:
{code:java}
java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
The root cause:
# The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
# In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl.
# This produces an assertion error: `assert handler != null`.
The issue is not caused by the catalog feature changes. It can be reproduced by adding a Thread.sleep in StripeEntryHandler#onEvent. Originally it was reproduced on a table drop, but it can also be reproduced on a table creation if "IDLE_SAFE_TIME_PROPAGATION_PERIOD_MILLISECONDS = 500;" is set.
> java.lang.AssertionError: Group of the event is unsupported > --- > > Key: IGNITE-20397 > URL: https://issues.apache.org/jira/browse/IGNITE-20397 > Project: Ignite > Issue Type: Bug >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3 > > {code:java} > java.lang.AssertionError: Group of the event is unsupported > [nodeId=<11_part_18/isaat_n_2>, > event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) > ~[disruptor-3.3.7.jar:?] > at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} > [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true] > The root cause: > # StripedDisruptor.StripeEntryHandler#onEvent method gets handler from > StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId(). > # In some cases the `subscribers` map is cleared by
[jira] [Updated] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported
[ https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20397:
---
Description:
{code:java}
java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
The root cause:
# The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
# In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl.
# This produces an assertion error: `assert handler != null`.
The issue is not caused by the catalog feature changes. It can be reproduced by adding a Thread.sleep in StripeEntryHandler#onEvent. Originally it was reproduced on a table drop, but it can also be reproduced on a table creation if "IDLE_SAFE_TIME_PROPAGATION_PERIOD_MILLISECONDS = 500;" is set.
was:
{code:java}
java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?]
    at java.lang.Thread.run(Thread.java:834) ~[?:?]
{code}
[https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
The root cause:
# The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
# In some cases the `subscribers` map is cleared by an invocation of StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler receives an event with SafeTimeSyncCommandImpl.
# This produces an assertion error: `assert handler != null`.
> java.lang.AssertionError: Group of the event is unsupported
> ---
>
> Key: IGNITE-20397
> URL: https://issues.apache.org/jira/browse/IGNITE-20397
> Project: Ignite
> Issue Type: Bug
> Reporter: Alexander Lapin
> Priority: Major
> Labels: ignite-3
>
> {code:java}
> java.lang.AssertionError: Group of the event is unsupported
> [nodeId=<11_part_18/isaat_n_2>,
> event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a]
> at
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224)
> ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
> at
> org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191)
> ~[ignite-raft-3.0.0-SNAPSHOT.jar:?]
> at
> com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137)
> ~[disruptor-3.3.7.jar:?]
> at java.lang.Thread.run(Thread.java:834) ~[?:?]
> {code}
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true]
> The root cause:
> # The StripedDisruptor.StripeEntryHandler#onEvent method gets the handler from
> StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId().
> # In some cases the `subscribers` map is cleared by an invocation of
> StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler
> receives an event with SafeTimeSyncCommandImpl.
> # This produces an assertion error: `assert handler != null`.
> The issue is not caused by the catalog feature changes.
> It can be reproduced by adding a Thread.sleep in
> StripeEntryHandler#onEvent.
> Originally it was reproduced on a table drop, but it can also be
> reproduced on a table creation if
> "IDLE_SAFE_TIME_PROPAGATION_PERIOD_MILLISECONDS = 500;" is set.
-- This message was sent by Atlassian Jira
[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20412:
---
Description:
h3. Motivation
org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart started to fail in the catalog-feature branch and fails in the main branch after catalog-feature was merged.
[https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=]
{code:java}
java.lang.AssertionError:
Expected: is <[]>
     but: was <[A]>
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
    at org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
    at org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
{code}
h3. Implementation notes
The root cause:
# The test changes the metaStorageManager behavior so that it throws an expected exception on ms.invoke.
# The test alters the zone with a new filter.
# DistributionZoneManager#onUpdateFilter returns a future from saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken).
# That future is completed exceptionally, so WatchProcessor#notificationFuture will be completed exceptionally.
# Subsequent updates will not be handled properly because notificationFuture is completed exceptionally.
We have already created tickets about exception handling:
* https://issues.apache.org/jira/browse/IGNITE-14693
* https://issues.apache.org/jira/browse/IGNITE-14611
The test scenario is incorrect because the node should be stopped (by the failure handler) if ms.invoke failed. We need to rewrite the test when the DZM restart logic is updated.
was:
h3. Motivation
org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart started to fail in the catalog-feature branch and fails in the main branch after catalog-feature was merged.
https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=
{code:java}
java.lang.AssertionError:
Expected: is <[]>
     but: was <[A]>
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
    at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
    at org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459)
    at org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539)
{code}
h3. Implementation notes
The root cause:
# The test changes the metaStorageManager behavior so that it throws an expected exception on ms.invoke.
# The test alters the zone with a new filter.
# DistributionZoneManager#onUpdateFilter returns a future from saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken).
# That future is completed exceptionally, so WatchProcessor#notificationFuture will be completed exceptionally.
# Subsequent updates will not be handled properly because notificationFuture is completed exceptionally.
We have already created tickets about exception handling:
* https://issues.apache.org/jira/browse/IGNITE-14693
* https://issues.apache.org/jira/browse/IGNITE-14611
I think the test scenario is incorrect because the node should be stopped (by the failure handler) if ms.invoke failed.
> Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > > > Key: IGNITE-20412 > URL: https://issues.apache.org/jira/browse/IGNITE-20412 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > h3. Motivation > org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > started to fall in the catalog-feature branch and fails in the main branch > after catalog-feature is merged > [https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests=] > {code:java} > java.lang.AssertionError: > Expected: is <[]> > but: was <[A]> > at
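Steps 4 and 5 of the root cause above can be illustrated with a minimal model (illustrative names, not the actual WatchProcessor code) of a notification future that chains each update onto the previous one: once one callback completes exceptionally, every later update in the chain is completed exceptionally as well and its handler never runs.

```java
import java.util.concurrent.CompletableFuture;

// Minimal sketch of a chained notification future: a single failure
// poisons the chain, so subsequent updates are not handled.
class NotificationChainSketch {
    private CompletableFuture<Void> notificationFuture = CompletableFuture.completedFuture(null);

    /** Chains the next update notification onto the previous one. */
    CompletableFuture<Void> submit(Runnable update) {
        notificationFuture = notificationFuture.thenRun(update);
        return notificationFuture;
    }
}
```

This is why the expected exception from ms.invoke in the test leaves the whole watch-notification pipeline broken for all later updates.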
[jira] [Assigned] (IGNITE-20447) Introduce a new failure handling component
[ https://issues.apache.org/jira/browse/IGNITE-20447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20447:
--
Assignee: Sergey Uttsel
> Introduce a new failure handling component
> --
>
> Key: IGNITE-20447
> URL: https://issues.apache.org/jira/browse/IGNITE-20447
> Project: Ignite
> Issue Type: Improvement
> Reporter: Vyacheslav Koptilin
> Assignee: Sergey Uttsel
> Priority: Major
> Labels: ignite-3
>
> Let's add a new component `failure` to Apache Ignite 3 and add base interfaces to this component.
> *Definition of done:*
> - introduced a new module to the Ignite 3 codebase
> - introduced a new Ignite component - _FailureProcessor_ - with a minimal no-op implementation. This component is responsible for processing critical errors.
> - introduced a new _FailureHandler_ interface. An implementation of this interface represents a concrete strategy for handling errors.
> - introduced a new enum _FailureType_ that describes the possible types of failure. The following types can be considered as a starting point: _CRITICAL_ERROR_, _SYSTEM_WORKER_TERMINATION_, _SYSTEM_WORKER_BLOCKED_, _SYSTEM_CRITICAL_OPERATION_TIMEOUT_
> - introduced a new class _FailureContext_ that contains information about the failure type and the exception.
> *Implementation notes:*
> All these classes and interfaces should be part of the internal API, because the end user should not provide a custom implementation of the failure handler; Apache Ignite should provide a closed list of handlers out of the box.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20397) java.lang.AssertionError: Group of the event is unsupported
[ https://issues.apache.org/jira/browse/IGNITE-20397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20397: --- Description: {code:java} java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true] The root cause: # StripedDisruptor.StripeEntryHandler#onEvent method gets handler from StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId(). # In some cases the `subscribers` map is cleared by invocation of StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler receives event with SafeTimeSyncCommandImpl. # It produces an assertion error: `assert handler != null` was: {code:java} java.lang.AssertionError: Group of the event is unsupported [nodeId=<11_part_18/isaat_n_2>, event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) ~[disruptor-3.3.7.jar:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] 
{code} https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true > java.lang.AssertionError: Group of the event is unsupported > --- > > Key: IGNITE-20397 > URL: https://issues.apache.org/jira/browse/IGNITE-20397 > Project: Ignite > Issue Type: Bug >Reporter: Alexander Lapin >Priority: Major > Labels: ignite-3 > > {code:java} > java.lang.AssertionError: Group of the event is unsupported > [nodeId=<11_part_18/isaat_n_2>, > event=org.apache.ignite.raft.jraft.core.NodeImpl$LogEntryAndClosure@653d84a] > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:224) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:191) > ~[ignite-raft-3.0.0-SNAPSHOT.jar:?] > at > com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) > ~[disruptor-3.3.7.jar:?] > at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} > [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7498320?expandCode+Inspection=true=true=false=true=false=true] > The root cause: > # StripedDisruptor.StripeEntryHandler#onEvent method gets handler from > StripedDisruptor.StripeEntryHandler#subscribers by event.nodeId(). > # In some cases the `subscribers` map is cleared by invocation of > StripedDisruptor.StripeEntryHandler#unsubscribe, and then StripeEntryHandler > receives event with SafeTimeSyncCommandImpl. > # It produces an assertion error: `assert handler != null` -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20412: --- Description: h3. Motivation org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart started to fail in the catalog-feature branch and now fails in the main branch after catalog-feature was merged https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= {code:java} java.lang.AssertionError: Expected: is <[]> but: was <[A]> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) at org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459) at org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539) {code} h3. Implementation notes The root cause: # This test changes metaStorageManager behavior so that it throws an expected exception on ms.invoke. # The test alters the zone with a new filter. # DistributionZoneManager#onUpdateFilter returns a future from saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken) # That future is completed exceptionally, so WatchProcessor#notificationFuture is completed exceptionally. # Subsequent updates will not be handled properly because notificationFuture is completed exceptionally. We have already created tickets about exception handling: * https://issues.apache.org/jira/browse/IGNITE-14693 * https://issues.apache.org/jira/browse/IGNITE-14611 I think the test scenario is incorrect because the node should be stopped (by the failure handler) if ms.invoke fails. 
was: *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart* started to fail in the [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] branch, and on other branches created from that branch; need to fix it. https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= > Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > > > Key: IGNITE-20412 > URL: https://issues.apache.org/jira/browse/IGNITE-20412 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > h3. Motivation > org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > started to fail in the catalog-feature branch and now fails in the main branch > after catalog-feature was merged > https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= > {code:java} > java.lang.AssertionError: > Expected: is <[]> > but: was <[A]> > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) > at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6) > at > org.apache.ignite.internal.distributionzones.DistributionZonesTestUtil.assertValueInStorage(DistributionZonesTestUtil.java:459) > at > org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest.testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart(ItIgniteDistributionZoneManagerNodeRestartTest.java:539) > {code} > h3. Implementation notes > The root cause: > # This test changes metaStorageManager behavior so that it throws an expected > exception on ms.invoke. > # The test alters the zone with a new filter. 
> # DistributionZoneManager#onUpdateFilter returns a future from > saveDataNodesToMetaStorageOnScaleUp(zoneId, causalityToken) > # That future is completed exceptionally, so WatchProcessor#notificationFuture > is completed exceptionally. > # Subsequent updates will not be handled properly because notificationFuture is > completed exceptionally. > We have already created tickets about exception handling: > * https://issues.apache.org/jira/browse/IGNITE-14693 > * https://issues.apache.org/jira/browse/IGNITE-14611 > I think the test scenario is incorrect because the node should be stopped (by > the failure handler) if ms.invoke fails. -- This message was sent by Atlassian Jira (v8.20.10#820010)
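The failure mode in the root-cause list above (one exceptionally completed stage blocking all later watch notifications) can be reproduced with plain `CompletableFuture` chaining. This is an illustrative sketch, not the actual WatchProcessor code; the class and method names are hypothetical:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Once one stage of a sequential notification chain completes exceptionally,
// every later stage chained with thenCompose is skipped: the chain stays
// poisoned, which is why subsequent watch updates are no longer handled.
public class PoisonedChainDemo {
    public static int handledAfterFailure() {
        AtomicInteger handled = new AtomicInteger();

        CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);

        // First update: the ms.invoke future fails.
        chain = chain.thenCompose(v ->
                CompletableFuture.failedFuture(new RuntimeException("ms.invoke failed")));

        // Second update: its callback never runs, because the previous
        // stage completed exceptionally and thenCompose propagates that.
        chain = chain.thenCompose(v -> {
            handled.incrementAndGet();
            return CompletableFuture.completedFuture(null);
        });

        return handled.get(); // stays 0
    }
}
```

This matches the stance at the end of the description: since the chain cannot recover on its own, an ms.invoke failure should be routed to a failure handler (stopping the node) rather than left to poison the notification chain.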
[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20412: --- Summary: Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart (was: Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart again) > Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > > > Key: IGNITE-20412 > URL: https://issues.apache.org/jira/browse/IGNITE-20412 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart* > started to fall in the > [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] > branch, and on other branches that are created from it branch, need to fix it. > https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20412) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20412: --- Description: *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart* started to fall in the [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] branch, and on other branches that are created from it branch, need to fix it. https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= was: *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart* started to fall in the [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] branch, and on other branches that are created from it branch, need to fix it. https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= > Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart > > > Key: IGNITE-20412 > URL: https://issues.apache.org/jira/browse/IGNITE-20412 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Fix For: 3.0.0-beta2 > > > *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpsTriggeredByFilterUpdateAndNodeJoinAreRestoredAfterRestart* > started to fall in the > [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] > branch, and on other branches that are created from it branch, need to fix it. > https://ci.ignite.apache.org/viewLog.html?buildId=7501721=buildResultsDiv=ApacheIgnite3xGradle_Test_RunAllTests= -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20332) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764063#comment-17764063 ] Sergey Uttsel commented on IGNITE-20332: I fixed it in the catalog-feature branch. I also fixed the flakiness in the main branch. > Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart > --- > > Key: IGNITE-20332 > URL: https://issues.apache.org/jira/browse/IGNITE-20332 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 40m > Remaining Estimate: 0h > > *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart* > started to fail in the > [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] > branch, and on other branches created from that branch; need to fix it. > https://ci.ignite.apache.org/viewLog.html?buildId=7470189=ApacheIgnite3xGradle_Test_RunAllTests=true -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20332) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20332: --- Reviewer: Mirza Aliev > Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart > --- > > Key: IGNITE-20332 > URL: https://issues.apache.org/jira/browse/IGNITE-20332 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart* > started to fall in the > [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] > branch, and on other branches that are created from it branch, need to fix it. > https://ci.ignite.apache.org/viewLog.html?buildId=7470189=ApacheIgnite3xGradle_Test_RunAllTests=true -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (IGNITE-20332) Fix ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart
[ https://issues.apache.org/jira/browse/IGNITE-20332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20332: -- Assignee: Sergey Uttsel > Fix > ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart > --- > > Key: IGNITE-20332 > URL: https://issues.apache.org/jira/browse/IGNITE-20332 > Project: Ignite > Issue Type: Improvement >Reporter: Kirill Tkalenko >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > *org.apache.ignite.internal.distribution.zones.ItIgniteDistributionZoneManagerNodeRestartTest#testScaleUpTriggeredByFilterUpdateIsRestoredAfterRestart* > started to fall in the > [catalog-feature|https://github.com/apache/ignite-3/tree/catalog-feature] > branch, and on other branches that are created from it branch, need to fix it. > https://ci.ignite.apache.org/viewLog.html?buildId=7470189=ApacheIgnite3xGradle_Test_RunAllTests=true -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in the zone lifecycle. The futures of these invokes are ignored, so when a lifecycle method completes, not all of its actions have actually completed. Therefore several invokes, for example on createZone and alterZone, can be reordered. Currently the meta storage invokes are done in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to update assignments on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and this ensures that the invoke completes within the current event handling. * We cannot return a future from LogicalTopologyEventListener's methods. We can ignore these futures. This has a drawback: we can skip a topology update # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 # Node C joined the topology and quickly left, and the ms invokes updating the topology entry were reordered. # The data nodes were not updated immediately to [A,B,C]. We think that we can ignore this bug because eventually it doesn't break the consistency of the data nodes. 
For this purpose we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` * Need to return futures from WatchListener#onUpdate method of the data nodes listener. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. We cannnot return future from LogicalTopologyEventListener's methods. We can ignore these futures. It has drawback: we can skip the topology update # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 # Node C was joined to the topology and left quickly and ms invokes to update topology entry was reordered. # data nodes was not updated immediately to [A,B,C]. 
We think that we can ignore this bug because eventually it doesn't break the consistency of the date node. For this purpose we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` Need to return futures from WatchListener#onUpdate method of the data nodes listener. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in zone's > lifecycle. The futures of these invokes are ignored, so
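The proposed `lt` versus `eq` invoke condition can be illustrated with a plain-Java simulation. This is a sketch under simplifying assumptions (versions as plain `long`s, conditions as in-memory comparisons, a hypothetical class name), not the actual meta storage API: it only shows why, when topology updates arrive reordered, the `eq(version - 1)` guard can get stuck on a stale version while the `lt(version)` guard still converges to the newest one.

```java
// Why the invoke condition should be `lt(newTopology.version())` rather than
// `eq(newTopology.version() - 1)`: under reordered updates, the eq guard can
// accept a stale version after skipping the newest one, while the lt guard
// always ends on the highest version seen.
public class TopologyVersionConditionDemo {
    /** Applies updates guarded by `stored == newVersion - 1` (the old condition). */
    public static long applyWithEq(long stored, long[] arrivals) {
        for (long v : arrivals) {
            if (stored == v - 1) {
                stored = v;
            }
        }
        return stored;
    }

    /** Applies updates guarded by `stored < newVersion` (the proposed condition). */
    public static long applyWithLt(long stored, long[] arrivals) {
        for (long v : arrivals) {
            if (stored < v) {
                stored = v;
            }
        }
        return stored;
    }
}
```

With arrivals reordered as version 2 before version 1, the eq guard skips version 2 (stored is 0, not 1) and then accepts the stale version 1, whereas the lt guard accepts version 2 and discards the stale version 1 — matching the description's claim that the skipped update does not break eventual consistency of the data nodes.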
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. * We cannnot return future from LogicalTopologyEventListener's methods. We can ignore these futures. It has drawback: we can skip the topology update # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 # Node C was joined to the topology and left quickly and ms invokes to update topology entry was reordered. # data nodes was not updated immediately to [A,B,C]. We think that we can ignore this bug because eventually it doesn't break the consistency of the date node. 
For this purpose we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` * Need to return ms invoke futures from WatchListener#onUpdate method of the data nodes listener. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* * ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. * We cannnot return future from LogicalTopologyEventListener's methods. We can ignore these futures. It has drawback: we can skip the topology update # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 # Node C was joined to the topology and left quickly and ms invokes to update topology entry was reordered. # data nodes was not updated immediately to [A,B,C]. 
We think that we can ignore this bug because eventually it doesn't break the consistency of the date node. For this purpose we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` * Need to return futures from WatchListener#onUpdate method of the data nodes listener. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in zone's > lifecycle. The futures of these invokes are
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. We cannnot return future from LogicalTopologyEventListener's methods. We can ignore these futures. It has drawback: we can skip the topology update # topology=[A,B], dataNodes=[A,B], scaleUp=0, scaleDown=100 # Node C was joined to the topology and left quickly and ms invokes to update topology entry was reordered. # data nodes was not updated immediately to [A,B,C]. We think that we can ignore this bug because eventually it doesn't break the consistency of the date node. 
For this purpose we need to change the invoke condition: `value(zonesLogicalTopologyVersionKey()).lt(longToBytes(newTopology.version()))` instead of `value(zonesLogicalTopologyVersionKey()).eq(longToBytes(newTopology.version() - 1))` Need to return futures from WatchListener#onUpdate method of the data nodes listener. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. # We cannnot return future from LogicalTopologyEventListener's methods. So we can chain their ms invokes futures in DZM or we can add tasks with ms invoke to executor. # Need to return futures from WatchListener#onUpdate method of the data nodes listener. 
> Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method is completed actually not all its actions are completed. Therefore > several invokes for example on createZone and alterZone can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on > replicas update. > #
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener watch listener to update pending assignments. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. # We cannnot return future from LogicalTopologyEventListener's methods. So we can chain their ms invokes futures in DZM or we can add tasks with ms invoke to executor. # Need to return futures from WatchListener#onUpdate method of the data nodes listener. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. 
Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. # We cannnot return future from LogicalTopologyEventListener's methods. So we can chain their ms invokes futures in DZM or we can add tasks with ms invoke to executor. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method is completed actually not all its actions are completed. Therefore > several invokes for example on createZone and alterZone can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. 
> # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on > replicas update. > # LogicalTopologyEventListener to update logical topology. > # DistributionZoneRebalanceEngine#createDistributionZonesDataNodesListener > watch listener to update pending assignments. > h3. *Definition of Done* > Need to ensure event handling linearization. > h3. *Implementation Notes* > # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, > DistributionZoneManager#onUpdateFilter and > DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration > listeners. So we can just return the ms invoke future from these methods > and it ensure, that this invoke will be completed within the current event > handling. > # We cannnot return future from LogicalTopologyEventListener's methods. So we > can chain their ms invokes
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # DistributionZoneRebalanceEngine#onUpdateReplicas to apdate assignment on replicas update. # LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, DistributionZoneManager#onUpdateFilter and DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration listeners. So we can just return the ms invoke future from these methods and it ensure, that this invoke will be completed within the current event handling. # We cannnot return future from LogicalTopologyEventListener's methods. So we can chain their ms invokes futures in DZM or we can add tasks with ms invoke to executor. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method is completed actually not all its actions are completed. Therefore several invokes for example on createZone and alterZone can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. 
# DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and DistributionZoneManager#onUpdateFilter are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and it ensures that this invoke is completed within the current event handling. # We cannot return a future from LogicalTopologyEventListener's methods. So we can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke to an executor. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method completes, not all of its actions have actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # DistributionZoneRebalanceEngine#onUpdateReplicas to update assignments on > replicas update. > # LogicalTopologyEventListener to update logical topology. > h3. *Definition of Done* > Need to ensure event handling linearization. > h3.
*Implementation Notes* > # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete, > DistributionZoneManager#onUpdateFilter and > DistributionZoneRebalanceEngine#onUpdateReplicas are invoked in configuration > listeners. So we can just return the ms invoke future from these methods, > and it ensures that this invoke is completed within the current event > handling. > # We cannot return a future from LogicalTopologyEventListener's methods. So we > can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke > to an executor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
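The future-chaining idea from the implementation notes can be sketched as below. This is a minimal illustration, not Ignite code; the class and method names (InvokeChain, enqueue) are hypothetical, and the `Supplier` stands in for a meta storage invoke:

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Minimal sketch of serializing meta storage invokes by chaining their
// futures, as proposed for LogicalTopologyEventListener handlers whose
// methods cannot return a future themselves. Names are illustrative.
class InvokeChain {
    private CompletableFuture<Void> tail = CompletableFuture.completedFuture(null);

    // Each new invoke starts only after the previous one completes, so two
    // invokes enqueued in order cannot be applied to the meta storage reordered.
    synchronized CompletableFuture<Void> enqueue(Supplier<CompletableFuture<Void>> msInvoke) {
        tail = tail.thenCompose(ignored -> msInvoke.get());
        return tail;
    }
}
```

Joining the returned future (or chaining further on it) makes the completion of the whole invoke sequence observable, which is exactly what the fire-and-forget invokes lack.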
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Therefore several invokes, for example on createZone and alterZone, can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to ensure event handling linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and DistributionZoneManager#onUpdateFilter are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and it ensures that this invoke is completed within the current event handling. # We cannot return a future from LogicalTopologyEventListener's methods. So we can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke to an executor. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Therefore several invokes, for example on createZone and alterZone, can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # LogicalTopologyEventListener to update logical topology. h3.
*Definition of Done* Need to return meta storage futures from event handlers to ensure event linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and DistributionZoneManager#onUpdateFilter are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and it ensures that this invoke is completed within the current event handling. # We cannot return a future from LogicalTopologyEventListener's methods. So we can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke to an executor. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method completes, not all of its actions have actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # LogicalTopologyEventListener to update logical topology. > h3. *Definition of Done* > Need to ensure event handling linearization. > h3. *Implementation Notes* > # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete > and DistributionZoneManager#onUpdateFilter are invoked in configuration > listeners. So we can just return the ms invoke future from these methods, > and it ensures that this invoke is completed within the current event > handling. > # We cannot return a future from LogicalTopologyEventListener's methods.
So we > can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke > to an executor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Therefore several invokes, for example on createZone and alterZone, can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to return meta storage futures from event handlers to ensure event linearization. h3. *Implementation Notes* # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and DistributionZoneManager#onUpdateFilter are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and it ensures that this invoke is completed within the current event handling. # We cannot return a future from LogicalTopologyEventListener's methods. So we can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke to an executor. was: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Therefore several invokes, for example on createZone and alterZone, can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage.
# LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to return meta storage futures from event handlers to ensure event linearization. h3. *Implementation Notes* ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and DistributionZoneManager#onUpdateFilter are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and it ensures that this invoke is completed within the current event handling. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method completes, not all of its actions have actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # LogicalTopologyEventListener to update logical topology. > h3. *Definition of Done* > Need to return meta storage futures from event handlers to ensure event > linearization. > h3. *Implementation Notes* > # ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete > and DistributionZoneManager#onUpdateFilter are invoked in configuration > listeners. So we can just return the ms invoke future from these methods, > and it ensures that this invoke is completed within the current event > handling. > # We cannot return a future from LogicalTopologyEventListener's methods.
So we > can chain their ms invoke futures in DZM, or we can add tasks with the ms invoke > to an executor. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: h3. *Motivation* There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Therefore several invokes, for example on createZone and alterZone, can be reordered. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # LogicalTopologyEventListener to update logical topology. h3. *Definition of Done* Need to return meta storage futures from event handlers to ensure event linearization. h3. *Implementation Notes* ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and DistributionZoneManager#onUpdateFilter are invoked in configuration listeners. So we can just return the ms invoke future from these methods, and it ensures that this invoke is completed within the current event handling. was: There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # LogicalTopologyEventListener to update logical topology. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. Need to return meta storage futures from event handlers to ensure event linearization.
> Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method completes, not all of its actions have actually completed. Therefore > several invokes, for example on createZone and alterZone, can be reordered. > Currently it does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > # LogicalTopologyEventListener to update logical topology. > h3. *Definition of Done* > Need to return meta storage futures from event handlers to ensure event > linearization. > h3. *Implementation Notes* > ZonesConfigurationListener#onCreate, ZonesConfigurationListener#onDelete and > DistributionZoneManager#onUpdateFilter are invoked in configuration > listeners. So we can just return the ms invoke future from these methods, > and it ensures that this invoke is completed within the current event > handling. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
[ https://issues.apache.org/jira/browse/IGNITE-20317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20317: --- Description: There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # LogicalTopologyEventListener to update logical topology. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. Need to return meta storage futures from event handlers to ensure event linearization. was: There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # LogicalTopologyEventListener to update logical topology. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # Also saveDataNodesToMetaStorageOnScaleUp and saveDataNodesToMetaStorageOnScaleDown do invokes. Need to return meta storage futures from event handlers to ensure event linearization. > Meta storage invokes are not completed when events are handled in DZM > -- > > Key: IGNITE-20317 > URL: https://issues.apache.org/jira/browse/IGNITE-20317 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > There are meta storage invokes in DistributionZoneManager in a zone's > lifecycle. The futures of these invokes are ignored, so after the lifecycle > method completes, not all of its actions have actually completed.
Currently it > does the meta storage invokes in: > # ZonesConfigurationListener#onCreate to init a zone. > # ZonesConfigurationListener#onDelete to clean up the zone data. > # LogicalTopologyEventListener to update logical topology. > # DistributionZoneManager#onUpdateFilter to save data nodes in the meta > storage. > Need to return meta storage futures from event handlers to ensure event > linearization. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-20326) Meta storage invokes are not completed when data nodes are recalculated in DZM
Sergey Uttsel created IGNITE-20326: -- Summary: Meta storage invokes are not completed when data nodes are recalculated in DZM Key: IGNITE-20326 URL: https://issues.apache.org/jira/browse/IGNITE-20326 Project: Ignite Issue Type: Bug Reporter: Sergey Uttsel There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Such invokes are used in: # DistributionZoneManager#saveDataNodesToMetaStorageOnScaleUp # DistributionZoneManager#saveDataNodesToMetaStorageOnScaleDown to recalculate data nodes when timers fire. Need to check whether we need to await the futures from these invokes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
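Awaiting the timer-fired recalculation instead of ignoring its invoke future can be sketched as below. This is a hedged illustration only: DataNodesRecalculator and its methods are hypothetical names, and the completed future stands in for the real meta storage invoke:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of awaiting a scale-up/scale-down data nodes
// recalculation rather than firing and forgetting it. Names are hypothetical.
class DataNodesRecalculator {
    // Stand-in for the meta storage invoke performed when a timer fires;
    // the real invoke would return whether the conditional update applied.
    CompletableFuture<Boolean> saveDataNodesInvoke() {
        return CompletableFuture.completedFuture(true);
    }

    // Awaiting (with a timeout) makes completion observable to callers and
    // to shutdown logic, which a dropped future does not allow.
    boolean recalculateAndAwait(long timeoutMillis) {
        return saveDataNodesInvoke().orTimeout(timeoutMillis, TimeUnit.MILLISECONDS).join();
    }
}
```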
[jira] [Updated] (IGNITE-20303) "Raft group on the node is already started" exception when pending and planned assignment changed faster than rebalance
[ https://issues.apache.org/jira/browse/IGNITE-20303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20303: --- Reviewer: Kirill Gusakov > "Raft group on the node is already started" exception when pending and > planned assignment changed faster then rebalance > --- > > Key: IGNITE-20303 > URL: https://issues.apache.org/jira/browse/IGNITE-20303 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > If many changes of assignment are happened quickly then rebalance does not > have time to be completed for each change. In this case exception is thrown: > {code:java} > 2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] > Error occurred when processing a watch event > org.apache.ignite.lang.IgniteInternalException: Raft group on the node is > already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer > [consistentId=irdt_ttqr_2, idx=0]]] > at > org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) > ~[main/:?] > at > org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) > ~[main/:?] > at > org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) > ~[main/:?] > at > org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) > ~[main/:?] > at > org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) > ~[main/:?] > at > org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) > ~[main/:?] > at > org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) > ~[main/:?] > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) > ~[?:?] 
> at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > ~[?:?] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > ~[?:?] > at java.lang.Thread.run(Thread.java:834) ~[?:?] > {code} > The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. > See exception in the test log: > {code:java} > @Test > void testThreeQueuedRebalances() throws Exception { > Node node = getNode(0); > createZone(node, ZONE_NAME, 1, 1); > createTable(node, ZONE_NAME, TABLE_NAME); > assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, > 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > waitPartitionAssignmentsSyncedToExpected(0, 2); > checkPartitionNodes(0, 2); > } > {code} > We can fix it by a check if the raft node and the Replica are created before > startPartitionRaftGroupNode and startReplicaWithNewListener in > TableManager#handleChangePendingAssignmentEvent. -- This message was sent by Atlassian Jira (v8.20.10#820010)
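The proposed fix — checking whether the raft node and the Replica are already created before starting them — can be sketched as below. This is a simplified, hypothetical illustration (RaftGroupStarter and startIfAbsent are not TableManager API):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of an idempotent start guard: a repeated pending-assignment
// event for the same partition becomes a no-op instead of throwing
// "Raft group on the node is already started". Names are illustrative.
class RaftGroupStarter {
    private final Set<String> startedGroups = ConcurrentHashMap.newKeySet();

    // Returns true only the first time a given raft node id is started;
    // the caller would perform the actual start only on a true result.
    boolean startIfAbsent(String raftNodeId) {
        return startedGroups.add(raftNodeId);
    }
}
```

A concurrent set gives an atomic check-and-mark, so two watch events racing on the same partition cannot both attempt the start.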
[jira] [Created] (IGNITE-20317) Meta storage invokes are not completed when events are handled in DZM
Sergey Uttsel created IGNITE-20317: -- Summary: Meta storage invokes are not completed when events are handled in DZM Key: IGNITE-20317 URL: https://issues.apache.org/jira/browse/IGNITE-20317 Project: Ignite Issue Type: Bug Reporter: Sergey Uttsel There are meta storage invokes in DistributionZoneManager in a zone's lifecycle. The futures of these invokes are ignored, so after the lifecycle method completes, not all of its actions have actually completed. Currently it does the meta storage invokes in: # ZonesConfigurationListener#onCreate to init a zone. # ZonesConfigurationListener#onDelete to clean up the zone data. # LogicalTopologyEventListener to update logical topology. # DistributionZoneManager#onUpdateFilter to save data nodes in the meta storage. # Also saveDataNodesToMetaStorageOnScaleUp and saveDataNodesToMetaStorageOnScaleDown do invokes. Need to return meta storage futures from event handlers to ensure event linearization. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20310) Meta storage invokes are not completed when DZM start is completed
[ https://issues.apache.org/jira/browse/IGNITE-20310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20310: --- Description: There are meta storage invokes in the DistributionZoneManager start. Currently it does the meta storage invokes in DistributionZoneManager#createOrRestoreZoneState: # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to init the default zone. # DistributionZoneManager#restoreTimers, in case a filter update was handled before the DZM stop but didn't update data nodes. Futures of these invokes are ignored. So after the start method completes, not all start actions have actually completed. was: There are meta storage invokes in the DistributionZoneManager start. Currently it does the meta storage invokes in DistributionZoneManager#createOrRestoreZoneState: # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to update data nodes. # DistributionZoneManager#restoreTimers, in case a filter update was handled before the DZM stop but didn't update data nodes. Futures of these invokes are ignored. So after the start method completes, not all start actions have actually completed. > Meta storage invokes are not completed when DZM start is completed > --- > > Key: IGNITE-20310 > URL: https://issues.apache.org/jira/browse/IGNITE-20310 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > There are meta storage invokes in the DistributionZoneManager start. Currently it > does the meta storage invokes in > DistributionZoneManager#createOrRestoreZoneState: > # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to init > the default zone. > # DistributionZoneManager#restoreTimers, in case a filter update was > handled before the DZM stop but didn't update data nodes. > Futures of these invokes are ignored. So after the start method completes, > not all start actions have actually completed.
[jira] [Updated] (IGNITE-20310) Meta storage invokes are not completed when DZM start is completed
[ https://issues.apache.org/jira/browse/IGNITE-20310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20310: --- Description: There are meta storage invokes in the DistributionZoneManager start. Currently it does the meta storage invokes in DistributionZoneManager#createOrRestoreZoneState: # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to update data nodes. # DistributionZoneManager#restoreTimers, in case a filter update was handled before the DZM stop but didn't update data nodes. Futures of these invokes are ignored. So after the start method completes, not all start actions have actually completed. was: There are meta storage invokes in the DistributionZoneManager start. Currently it does the meta storage invokes in DistributionZoneManager#createOrRestoreZoneState: # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to update data nodes. # DistributionZoneManager#restoreTimers, in case a filter update was handled before the DZM stop but didn't update data nodes. Futures of these invokes are ignored. So after the start method completes, not all start actions have actually completed. > Meta storage invokes are not completed when DZM start is completed > --- > > Key: IGNITE-20310 > URL: https://issues.apache.org/jira/browse/IGNITE-20310 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > There are meta storage invokes in the DistributionZoneManager start. Currently it > does the meta storage invokes in > DistributionZoneManager#createOrRestoreZoneState: > # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to update > data nodes. > # DistributionZoneManager#restoreTimers, in case a filter update was > handled before the DZM stop but didn't update data nodes. > Futures of these invokes are ignored. So after the start method completes, > not all start actions have actually completed.
[jira] [Updated] (IGNITE-20310) Meta storage invokes are not completed when DZM start is completed
[ https://issues.apache.org/jira/browse/IGNITE-20310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20310: --- Description: There are meta storage invokes in the DistributionZoneManager start. Currently it does the meta storage invokes in DistributionZoneManager#createOrRestoreZoneState: # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to update data nodes. # DistributionZoneManager#restoreTimers, in case a filter update was handled before the DZM stop but didn't update data nodes. Futures of these invokes are ignored. So after the start method completes, not all start actions have actually completed. was: There are meta storage invokes in the DistributionZoneManager start. Currently it does the meta storage invoke in case a filter update was handled before the DZM stop but didn't update data nodes. The future of this invoke is ignored. So after the start method completes, not all start actions have actually completed. > Meta storage invokes are not completed when DZM start is completed > --- > > Key: IGNITE-20310 > URL: https://issues.apache.org/jira/browse/IGNITE-20310 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > There are meta storage invokes in the DistributionZoneManager start. Currently it > does the meta storage invokes in > DistributionZoneManager#createOrRestoreZoneState: > # DistributionZoneManager#initDataNodesAndTriggerKeysInMetaStorage to update > data nodes. > # DistributionZoneManager#restoreTimers, in case a filter update was > handled before the DZM stop but didn't update data nodes. > Futures of these invokes are ignored. So after the start method completes, > not all start actions have actually completed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
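Completing start only once all start-time invokes are done can be sketched with CompletableFuture.allOf. This is a hypothetical outline, not DistributionZoneManager code; the per-zone methods below merely stand in for the real initDataNodesAndTriggerKeysInMetaStorage and restoreTimers invokes:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: collect the meta storage invoke futures issued during
// start and consider start complete only when all of them are done.
class ZoneManagerStart {
    CompletableFuture<Void> startAsync(int zoneCount) {
        List<CompletableFuture<?>> invokes = new ArrayList<>();
        for (int zoneId = 0; zoneId < zoneCount; zoneId++) {
            invokes.add(initZoneInvoke(zoneId));      // stand-in: init data nodes and trigger keys
            invokes.add(restoreTimersInvoke(zoneId)); // stand-in: re-apply a filter update missed before stop
        }
        // allOf completes only when every collected invoke future completes.
        return CompletableFuture.allOf(invokes.toArray(new CompletableFuture[0]));
    }

    private CompletableFuture<Void> initZoneInvoke(int zoneId) {
        return CompletableFuture.completedFuture(null); // placeholder for a real ms invoke
    }

    private CompletableFuture<Void> restoreTimersInvoke(int zoneId) {
        return CompletableFuture.completedFuture(null); // placeholder for a real ms invoke
    }
}
```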
[jira] [Assigned] (IGNITE-20303) "Raft group on the node is already started" exception when pending and planned assignment changed faster than rebalance
[ https://issues.apache.org/jira/browse/IGNITE-20303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20303: -- Assignee: Sergey Uttsel > "Raft group on the node is already started" exception when pending and > planned assignment changed faster then rebalance > --- > > Key: IGNITE-20303 > URL: https://issues.apache.org/jira/browse/IGNITE-20303 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > If many changes of assignment are happened quickly then rebalance does not > have time to be completed for each change. In this case exception is thrown: > {code:java} > 2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] > Error occurred when processing a watch event > org.apache.ignite.lang.IgniteInternalException: Raft group on the node is > already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer > [consistentId=irdt_ttqr_2, idx=0]]] > at > org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) > ~[main/:?] > at > org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) > ~[main/:?] > at > org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) > ~[main/:?] > at > org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) > ~[main/:?] > at > org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) > ~[main/:?] > at > org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) > ~[main/:?] > at > org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) > ~[main/:?] > at > java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) > ~[?:?] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > ~[?:?] 
> at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > ~[?:?] > at java.lang.Thread.run(Thread.java:834) ~[?:?] > {code} > The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. > See exception in the test log: > {code:java} > @Test > void testThreeQueuedRebalances() throws Exception { > Node node = getNode(0); > createZone(node, ZONE_NAME, 1, 1); > createTable(node, ZONE_NAME, TABLE_NAME); > assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, > 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > waitPartitionAssignmentsSyncedToExpected(0, 2); > checkPartitionNodes(0, 2); > } > {code} > We can fix it by a check if the raft node and the Replica are created before > startPartitionRaftGroupNode and startReplicaWithNewListener in > TableManager#handleChangePendingAssignmentEvent. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (IGNITE-20310) Meta storage invokes are not completed when DZM start is completed
Sergey Uttsel created IGNITE-20310: -- Summary: Meta storage invokes are not completed when DZM start is completed Key: IGNITE-20310 URL: https://issues.apache.org/jira/browse/IGNITE-20310 Project: Ignite Issue Type: Bug Reporter: Sergey Uttsel There are meta storage invokes in DistributionZoneManager start. Currently it does the meta storage invoke in case when a filter update was handled before DZM stop, but it didn't update data nodes. The future of this invoke is ignored. So after the start method is completed actually not all start actions are completed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20303) "Raft group on the node is already started" exception when pending and planned assignment changed faster than rebalance
[ https://issues.apache.org/jira/browse/IGNITE-20303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20303: --- Description: If many changes of assignment are happened quickly then rebalance does not have time to be completed for each change. In this case exception is thrown: {code:java} 2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]] at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?] at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. 
See exception in the test log: {code:java} @Test void testThreeQueuedRebalances() throws Exception { Node node = getNode(0); createZone(node, ZONE_NAME, 1, 1); createTable(node, ZONE_NAME, TABLE_NAME); assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); waitPartitionAssignmentsSyncedToExpected(0, 2); checkPartitionNodes(0, 2); } {code} We can fix it by a check if the raft node and the Replica are created before startPartitionRaftGroupNode and startReplicaWithNewListener in TableManager#handleChangePendingAssignmentEvent. was: If many changes of assignment are happened quickly then rebalance does not have time to be completed for each change. In this case exception is thrown: {code:java} 2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]] at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?] 
at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. See exception in the test log: {code:java} @Test void testThreeQueuedRebalances() throws Exception { Node node = getNode(0); createZone(node, ZONE_NAME, 1, 1);
[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations
[ https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20279: --- Description: The issue is shown in the test, where several zone change operations occur. In this case, the operation can be reordered and incomplete at the end of the test. There are messages "Received update for replicas number" in the test log in a wrong order. The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances: {code:java} @Test void testThreeQueuedRebalances() throws Exception { Node node = getNode(0); createZone(node, ZONE_NAME, 1, 1); createTable(node, ZONE_NAME, TABLE_NAME); assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); waitPartitionAssignmentsSyncedToExpected(0, 2); checkPartitionNodes(0, 2); } {code} was: The issue is shown in the test, where several zone change operations occur. In this case, the operation can be reordered and incomplete at the end of the test. There are messages "Received update for replicas number" in the test log in a wrong order. The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. 
See exception in the test log: {code:java} @Test void testThreeQueuedRebalances() throws Exception { Node node = getNode(0); createZone(node, ZONE_NAME, 1, 1); createTable(node, ZONE_NAME, TABLE_NAME); assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); waitPartitionAssignmentsSyncedToExpected(0, 2); checkPartitionNodes(0, 2); } {code} > Reordering of altering zone operations > -- > > Key: IGNITE-20279 > URL: https://issues.apache.org/jira/browse/IGNITE-20279 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > The issue is shown in the test, where several zone change operations occur. > In this case, the operation can be reordered and incomplete at the end of the > test. There are messages "Received update for replicas number" in the test > log in a wrong order. 
The reproducer based on > ItRebalanceDistributedTest#testThreeQueuedRebalances: > {code:java} > @Test > void testThreeQueuedRebalances() throws Exception { > Node node = getNode(0); > createZone(node, ZONE_NAME, 1, 1); > createTable(node, ZONE_NAME, TABLE_NAME); > assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, > 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > waitPartitionAssignmentsSyncedToExpected(0, 2); > checkPartitionNodes(0, 2); > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
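One illustrative way to prevent such reordering is to chain every zone alteration onto the previous one's future, so changes are applied strictly in submission order and the last one is awaited. This is a generic pattern sketched under that assumption, not the actual Ignite fix:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Sketch: each alterZoneReplicas call waits for the previous one, so
// "Received update for replicas number" events cannot arrive out of order,
// and whenIdle() lets a test wait for the last queued alteration.
public class SerializedAlters {
    private CompletableFuture<Void> chain = CompletableFuture.completedFuture(null);
    final List<Integer> appliedReplicas = new CopyOnWriteArrayList<>();

    public synchronized CompletableFuture<Void> alterZoneReplicas(int replicas) {
        chain = chain.thenRunAsync(() -> appliedReplicas.add(replicas));
        return chain;
    }

    /** Future that completes when every queued alteration has been applied. */
    public synchronized CompletableFuture<Void> whenIdle() {
        return chain;
    }

    public static void main(String[] args) {
        SerializedAlters zone = new SerializedAlters();
        for (int r : new int[] {2, 3, 2, 3, 2}) {
            zone.alterZoneReplicas(r);
        }
        zone.whenIdle().join();
        System.out.println(zone.appliedReplicas); // submission order is preserved
    }
}
```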
[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations
[ https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20279: --- Reviewer: Mirza Aliev > Reordering of altering zone operations > -- > > Key: IGNITE-20279 > URL: https://issues.apache.org/jira/browse/IGNITE-20279 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > The issue is shown in the test, where several zone change operations occur. > In this case, the operation can be reordered and incomplete at the end of the > test. There are messages "Received update for replicas number" in the test > log in a wrong order. The reproducer based on > ItRebalanceDistributedTest#testThreeQueuedRebalances. See exception in the > test log: > {code:java} > @Test > void testThreeQueuedRebalances() throws Exception { > Node node = getNode(0); > createZone(node, ZONE_NAME, 1, 1); > createTable(node, ZONE_NAME, TABLE_NAME); > assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, > 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > waitPartitionAssignmentsSyncedToExpected(0, 2); > checkPartitionNodes(0, 2); > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20279) Reordering of altering zone operations
[ https://issues.apache.org/jira/browse/IGNITE-20279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20279: --- Description: The issue is shown in the test, where several zone change operations occur. In this case, the operation can be reordered and incomplete at the end of the test. There are messages "Received update for replicas number" in the test log in a wrong order. The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. See exception in the test log: {code:java} @Test void testThreeQueuedRebalances() throws Exception { Node node = getNode(0); createZone(node, ZONE_NAME, 1, 1); createTable(node, ZONE_NAME, TABLE_NAME); assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); waitPartitionAssignmentsSyncedToExpected(0, 2); checkPartitionNodes(0, 2); } {code} was: The issue is shown in the test, where several zone change operations occur. On my laptop, the test ({{tRebalanceDistributedTest#testThreeQueuedRebalances}}) fails at least twice on 30 runs. # The first issue that I see is that the test does not wait to execute the last zone change operation: alterZone(node, ZONE_NAME, 2). In this case, the operation can be incomplete at the end of the test. # The second issue is that the next operation may start earlier than the previous one is completed. 
{noformat} 2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]] at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?] at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {noformat} > Reordering of altering zone operations > -- > > Key: IGNITE-20279 > URL: https://issues.apache.org/jira/browse/IGNITE-20279 > Project: Ignite > Issue Type: Bug >Reporter: Vladislav Pyatkov >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > The issue is shown in the test, where several zone change operations occur. > In this case, the operation can be reordered and incomplete at the end of the > test. There are messages "Received update for replicas number" in the test > log in a wrong order. The reproducer based on > ItRebalanceDistributedTest#testThreeQueuedRebalances. 
See exception in the > test log: > {code:java} > @Test > void testThreeQueuedRebalances() throws Exception { > Node node = getNode(0); > createZone(node, ZONE_NAME, 1, 1); > createTable(node, ZONE_NAME, TABLE_NAME); > assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, > 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); > alterZone(node, ZONE_NAME, 2); > alterZone(node, ZONE_NAME, 3); >
[jira] [Created] (IGNITE-20303) "Raft group on the node is already started" exception when pending and planned assignment changed faster than rebalance
Sergey Uttsel created IGNITE-20303: -- Summary: "Raft group on the node is already started" exception when pending and planned assignment changed faster then rebalance Key: IGNITE-20303 URL: https://issues.apache.org/jira/browse/IGNITE-20303 Project: Ignite Issue Type: Bug Reporter: Sergey Uttsel If many changes of assignment are happened quickly then rebalance does not have time to be completed for each change. In this case exception is thrown: {code:java} 2023-08-24T16:58:51,328][ERROR][%irdt_ttqr_2%tableManager-io-10][WatchProcessor] Error occurred when processing a watch event org.apache.ignite.lang.IgniteInternalException: Raft group on the node is already started [nodeId=RaftNodeId [groupId=1_part_0, peer=Peer [consistentId=irdt_ttqr_2, idx=0]]] at org.apache.ignite.internal.raft.Loza.startRaftGroupNodeInternal(Loza.java:342) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:230) ~[main/:?] at org.apache.ignite.internal.raft.Loza.startRaftGroupNode(Loza.java:203) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.startPartitionRaftGroupNode(TableManager.java:2361) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$98(TableManager.java:2261) ~[main/:?] at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:922) ~[main/:?] at org.apache.ignite.internal.table.distributed.TableManager.lambda$handleChangePendingAssignmentEvent$99(TableManager.java:2259) ~[main/:?] at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:834) ~[?:?] {code} The reproducer based on ItRebalanceDistributedTest#testThreeQueuedRebalances. 
See exception in the test log: {code:java} @Test void testThreeQueuedRebalances() throws Exception { Node node = getNode(0); createZone(node, ZONE_NAME, 1, 1); createTable(node, ZONE_NAME, TABLE_NAME); assertTrue(waitForCondition(() -> getPartitionClusterNodes(node, 0).size() == 1, AWAIT_TIMEOUT_MILLIS)); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); alterZone(node, ZONE_NAME, 3); alterZone(node, ZONE_NAME, 2); waitPartitionAssignmentsSyncedToExpected(0, 2); checkPartitionNodes(0, 2); } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20054) Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest
[ https://issues.apache.org/jira/browse/IGNITE-20054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20054: --- Reviewer: Mirza Aliev > Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest > - > > Key: IGNITE-20054 > URL: https://issues.apache.org/jira/browse/IGNITE-20054 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 10m > Remaining Estimate: 0h > > *Motivation* > After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. > For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage. Then it checks the data nodes for the zone. But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. So I tried to fix this and similar tests by checking data nodes in the metastorage, but after that these tests became flaky. > *Definition of Done* > Fix and enable the tests from ItIgniteDistributionZoneManagerNodeRestartTest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20272) Clean up of DistributionZoneManagerWatchListenerTest
[ https://issues.apache.org/jira/browse/IGNITE-20272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20272: --- Reviewer: Mirza Aliev > Clean up of DistributionZoneManagerWatchListenerTest > > > Key: IGNITE-20272 > URL: https://issues.apache.org/jira/browse/IGNITE-20272 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3, tech-debt > Time Spent: 10m > Remaining Estimate: 0h > > h3. *Motivation* > Ticket https://issues.apache.org/jira/browse/IGNITE-18564 was closed because > it is not actual anymore. But actually, the > DistributionZoneManagerWatchListenerTest that was mentioned in this ticket is > not actual. So we need to remove some tests in this class now and later > remove this class when ticket > https://issues.apache.org/jira/browse/IGNITE-19955 is implemented. > DistributionZoneManagerWatchListenerTest has three tests: > # The testStaleWatchEvent is disabled. It fails on an invoke into a > metastorage in which the logical topology and its version are updated. It > fails because the condition of the invoke was incorrect after some changes in > the code. But it is not necessary to use a condition there, I replaced it > with ms.putAll, and the test passed successfully. This test can be removed > because it repeats the > testScaleUpDidNotChangeDataNodesWhenTriggerKeyWasConcurrentlyChanged test. > # testDataNodesUpdatedOnZoneManagerStart is the happy path of the restart, we > already have such tests. Therefore, this test is not needed and can be > removed. > # testStaleVaultRevisionOnZoneManagerStart This test simulates that on the > zones manager restart, the data nodes for the zone will not be updated to the > value corresponding to the logical topology from the vault, because > zonesChangeTriggerKey > metaStorageManager.appliedRevision(). I have not > found an analog of this test. 
I think that when the DZM restart is updated, we can update this test and move it to a more appropriate class. > h3. *Definition of done* > # testStaleWatchEvent and testDataNodesUpdatedOnZoneManagerStart are removed. > # testStaleVaultRevisionOnZoneManagerStart is marked by IGNITE-19955. > # TODOs with IGNITE-18564 are removed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20272) Clean up of DistributionZoneManagerWatchListenerTest
[ https://issues.apache.org/jira/browse/IGNITE-20272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20272: --- Issue Type: Improvement (was: Bug) > Clean up of DistributionZoneManagerWatchListenerTest > > > Key: IGNITE-20272 > URL: https://issues.apache.org/jira/browse/IGNITE-20272 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3, tech-debt > > h3. *Motivation* > Ticket https://issues.apache.org/jira/browse/IGNITE-18564 was closed because > it is not actual anymore. But actually the > DistributionZoneManagerWatchListenerTest that was mentioned in this ticket is > not actual. So we need remove some tests in this class now and later remove > this class when ticket https://issues.apache.org/jira/browse/IGNITE-19955 > will be implemented. > DistributionZoneManagerWatchListenerTest has three tests: > # The testStaleWatchEvent is disabled. It fails on an invoke into a > metastorage in which the logical topology and its version are updated. It > fails because the condition of the invoke was incorrect after some changes in > the code. But it is not necessary to use a condition there, I replaced it > with ms.putAll and the test passed successfully. This test can be removed > because it repeats the > testScaleUpDidNotChangeDataNodesWhenTriggerKeyWasConcurrentlyChanged test. > # testDataNodesUpdatedOnZoneManagerStart is the happy path of restart, we > already have such tests. Therefore, this test is not needed and can be > removed. > # testStaleVaultRevisionOnZoneManagerStart This test simulates that on the > zones manager restart, the data nodes for the zone will not be updated to the > value corresponding to the logical topology from the vault, because > zonesChangeTriggerKey > metaStorageManager.appliedRevision(). I have not > found a analogue of this test. 
I think that when the DZM restart is updated, we can update this test and move it to a more appropriate class. > h3. *Definition of done* > # testStaleWatchEvent and testDataNodesUpdatedOnZoneManagerStart are removed. > # testStaleVaultRevisionOnZoneManagerStart is marked by IGNITE-19955. > # TODOs with IGNITE-18564 are removed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20272) Clean up of DistributionZoneManagerWatchListenerTest
[ https://issues.apache.org/jira/browse/IGNITE-20272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20272: --- Description: h3. *Motivation* Ticket https://issues.apache.org/jira/browse/IGNITE-18564 was closed because it is not actual anymore. But actually the DistributionZoneManagerWatchListenerTest that was mentioned in this ticket is not actual. So we need remove some tests in this class now and later remove this class when ticket https://issues.apache.org/jira/browse/IGNITE-19955 will be implemented. DistributionZoneManagerWatchListenerTest has three tests: # The testStaleWatchEvent is disabled. It fails on an invoke into a metastorage in which the logical topology and its version are updated. It fails because the condition of the invoke was incorrect after some changes in the code. But it is not necessary to use a condition there, I replaced it with ms.putAll and the test passed successfully. This test can be removed because it repeats the testScaleUpDidNotChangeDataNodesWhenTriggerKeyWasConcurrentlyChanged test. # testDataNodesUpdatedOnZoneManagerStart is the happy path of restart, we already have such tests. Therefore, this test is not needed and can be removed. # testStaleVaultRevisionOnZoneManagerStart This test simulates that on the zones manager restart, the data nodes for the zone will not be updated to the value corresponding to the logical topology from the vault, because zonesChangeTriggerKey > metaStorageManager.appliedRevision(). I have not found a analogue of this test. I think that when the DZM restart is updated then we can update this test and move it to more appropriate class. h3. *Definition of done* # testStaleWatchEvent and testDataNodesUpdatedOnZoneManagerStart are removed. # testStaleVaultRevisionOnZoneManagerStart marked by IGNITE-19955. # TODOs with IGNITE-18564 are removed. was: h3. 
*Motivation* Ticket https://issues.apache.org/jira/browse/IGNITE-18564 was closed because it is not actual anymore. But actually the DistributionZoneManagerWatchListenerTest that was mentioned in this ticket is not actual. So we need remove some tests in this class now and later remove this class when ticket https://issues.apache.org/jira/browse/IGNITE-19955 will be implemented. DistributionZoneManagerWatchListenerTest has three tests: # The testStaleWatchEvent is disabled. It fails on an invoke into a metastorage in which the logical topology and its version are updated. It fails because the condition of the invoke was incorrect after some changes in the code. But it is not necessary to use a condition there, I replaced it with ms.putAll and the test passed successfully. This test can be removed because it repeats the testScaleUpDidNotChangeDataNodesWhenTriggerKeyWasConcurrentlyChanged test. # testDataNodesUpdatedOnZoneManagerStart is the happy path of restart, we already have such tests. Therefore, this test is not needed and can be removed. # testStaleVaultRevisionOnZoneManagerStart This test simulates that on the zones manager restart, the data nodes for the zone will not be updated to the value corresponding to the logical topology from the vault, because zonesChangeTriggerKey > metaStorageManager.appliedRevision(). I have not found a analogue of this test. I think that when the DZM restart is updated then we can update this test and move it to more appropriate class. h3. *Definition of done* # testStaleWatchEvent and testDataNodesUpdatedOnZoneManagerStart are removed. # testStaleVaultRevisionOnZoneManagerStart marked by IGNITE-19955. # TOTOs with IGNITE-18564 are removed. > Clean up of DistributionZoneManagerWatchListenerTest > > > Key: IGNITE-20272 > URL: https://issues.apache.org/jira/browse/IGNITE-20272 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3, tech-debt > > h3. 
*Motivation* > Ticket https://issues.apache.org/jira/browse/IGNITE-18564 was closed because > it is not actual anymore. But actually the > DistributionZoneManagerWatchListenerTest that was mentioned in this ticket is > not actual. So we need remove some tests in this class now and later remove > this class when ticket https://issues.apache.org/jira/browse/IGNITE-19955 > will be implemented. > DistributionZoneManagerWatchListenerTest has three tests: > # The testStaleWatchEvent is disabled. It fails on an invoke into a > metastorage in which the logical topology and its version are updated. It > fails because the condition of the invoke was incorrect after some changes in > the code. But it is not necessary to use a condition there, I replaced it > with ms.putAll and the test passed successfully. This test can be removed > because it repeats the >
[jira] [Created] (IGNITE-20272) Clean up of DistributionZoneManagerWatchListenerTest
Sergey Uttsel created IGNITE-20272: -- Summary: Clean up of DistributionZoneManagerWatchListenerTest Key: IGNITE-20272 URL: https://issues.apache.org/jira/browse/IGNITE-20272 Project: Ignite Issue Type: Bug Reporter: Sergey Uttsel h3. *Motivation* Ticket https://issues.apache.org/jira/browse/IGNITE-18564 was closed because it is not actual anymore. But actually the DistributionZoneManagerWatchListenerTest that was mentioned in this ticket is not actual. So we need remove some tests in this class now and later remove this class when ticket https://issues.apache.org/jira/browse/IGNITE-19955 will be implemented. DistributionZoneManagerWatchListenerTest has three tests: # The testStaleWatchEvent is disabled. It fails on an invoke into a metastorage in which the logical topology and its version are updated. It fails because the condition of the invoke was incorrect after some changes in the code. But it is not necessary to use a condition there, I replaced it with ms.putAll and the test passed successfully. This test can be removed because it repeats the testScaleUpDidNotChangeDataNodesWhenTriggerKeyWasConcurrentlyChanged test. # testDataNodesUpdatedOnZoneManagerStart is the happy path of restart, we already have such tests. Therefore, this test is not needed and can be removed. # testStaleVaultRevisionOnZoneManagerStart This test simulates that on the zones manager restart, the data nodes for the zone will not be updated to the value corresponding to the logical topology from the vault, because zonesChangeTriggerKey > metaStorageManager.appliedRevision(). I have not found a analogue of this test. I think that when the DZM restart is updated then we can update this test and move it to more appropriate class. h3. *Definition of done* # testStaleWatchEvent and testDataNodesUpdatedOnZoneManagerStart are removed. # testStaleVaultRevisionOnZoneManagerStart marked by IGNITE-19955. # TOTOs with IGNITE-18564 are removed. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20050) Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' configuration changes
[ https://issues.apache.org/jira/browse/IGNITE-20050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20050: --- Description: *Motivation* CausalityDataNodesEngine#zonesVersionedCfg contains zones' configuration changes. It is updated with a revision and a configuration event on a zone creation, a scale up update, a scale down update, and so on. But this map does not remove old values. We need to keep a history of changes to some depth. The easiest way to clear the zonesVersionedCfg is to do it on the meta storage compaction. For this purpose we need to create a notification about compaction and clear older configurations in the zonesVersionedCfg except the last one. Another approach is to investigate all current and potential future usages of dataNodes to find out when we can clear the zonesVersionedCfg. There are cases when we can request the data nodes value with the same token several times over an arbitrary period of time, for example before and after the rebalance. But the dataNodes method reads data from the meta storage, so we also need to keep dataNodes and other keys in the meta storage. Therefore, this approach also depends on the meta storage compaction. *Definition of Done* # Find out how deep the history of changes needs to be stored. # Remove old values. was: *Motivation* CausalityDataNodesEngine#zonesVersionedCfg contains zones' configuration changes. It updates with revision and configuration event on a zone creation, a scale up update, a scale up update and so on. But this map does not remove old values. We need to keep a history of changes to some depth. *Definition of Done* # Find out how deep the history of changes needs to be stored. # Remove old values.
> Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' > configuration changes > -- > > Key: IGNITE-20050 > URL: https://issues.apache.org/jira/browse/IGNITE-20050 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > *Motivation* > CausalityDataNodesEngine#zonesVersionedCfg contains zones' configuration > changes. It updates with revision and configuration event on a zone creation, > a scale up update, a scale up update and so on. But this map does not remove > old values. We need to keep a history of changes to some depth. > The easiest way to clear the zonesVersionedCfg is to do it on the meta > storage compaction. For this purpose need to create notification about > compaction and clear older configurations in the zonesVersionedCfg except of > the last one. > Another approach is to investigate all current and potential future usages of > dataNodes to find out when we can clear the zonesVersionedCfg. There are > cases when we can request the date nodes value with the same token several > times over an arbitrary period of time. For example before and after the > rebalance. But the dataNodes method reads data from the meta storage so we > also need to keep dataNodes and other keys in the meta storage. Therefore, > this approach also depends on the meta storage compaction. > *Definition of Done* > # Find out how deep the history of changes needs to be stored. > # Remove old values. -- This message was sent by Atlassian Jira (v8.20.10#820010)
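The compaction-based cleanup discussed in the IGNITE-20050 description above can be sketched as follows. This is a minimal illustration under assumed names: the class, field, and method names below are hypothetical, not the real CausalityDataNodesEngine API. On a compaction notification, every history entry strictly older than the entry effective at the compaction revision is dropped, so a read at or after the compaction boundary still resolves.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of compaction-driven cleanup of a versioned zone
// configuration history. Names are illustrative, not the real Ignite API.
class ZoneCfgHistory {
    // revision -> zone configuration snapshot (String stands in for the real config type).
    final ConcurrentNavigableMap<Long, String> versionedCfg = new ConcurrentSkipListMap<>();

    // Called on a meta storage compaction notification: drop every entry
    // strictly older than the one effective at the compaction revision.
    void onCompaction(long compactionRevision) {
        Map.Entry<Long, String> effective = versionedCfg.floorEntry(compactionRevision);
        if (effective == null) {
            return; // no history at or below the boundary, nothing to prune
        }
        // headMap(key, false) is a live view; clearing it removes the stale entries.
        versionedCfg.headMap(effective.getKey(), false).clear();
    }
}
```

Using a floorEntry lookup keeps exactly one entry at or below the compaction revision, matching the requirement to clear older configurations "except the last one".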
[jira] [Resolved] (IGNITE-19403) Watch listeners must be deployed after the zone manager starts
[ https://issues.apache.org/jira/browse/IGNITE-19403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel resolved IGNITE-19403. Assignee: Sergey Uttsel Resolution: Fixed > Watch listeners must be deployed after the zone manager starts > -- > > Key: IGNITE-19403 > URL: https://issues.apache.org/jira/browse/IGNITE-19403 > Project: Ignite > Issue Type: Test >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3, tech-debt > > h3. *Motivation* > The method > {{DistributionZonesTestUtil#deployWatchesAndUpdateMetaStorageRevision}} is > used in tests related to the distribution zone manager to increase the meta > storage applied revision before the distribution zone manager starts. The method > breaks the invariant: the zone manager must be started before > metaStorageManager.deployWatches() is invoked. We need a proper solution for > increasing the applied revision. > > The first approach to fix it is to invoke the methods in this order: > > {code:java} > vaultManager.put(new ByteArray("applied_revision"), longToBytes(1)).get(); > metaStorageManager.start(); > distributionZoneManager.start(); > metaStorageManager.deployWatches(); > {code} > > First we put applied_revision. Then we start metaStorageManager and > distributionZoneManager. Then we deploy watches. > The disadvantage of this approach is that the ByteArray("applied_revision") is > an internal part of the implementation. > > The second way is to restart all of the components used in the test to > simulate a node restart. In this case, after the zone manager's restart, > the revision will be greater than zero. > h3. *Definition of Done* > The {{deployWatchesAndUpdateMetaStorageRevision}} is replaced by a proper > solution. Try the approach with a restart of all components. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20058) NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter
[ https://issues.apache.org/jira/browse/IGNITE-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20058: --- Description: *{{Motivation}}* {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at 
org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} *Implementation Notes* The reason is the wrong start order of the components: # Firstly metastorage watch listeners are deployed. # Then DistributionZoneManager is started. So I change this order to fix the issue. Also I will close https://issues.apache.org/jira/browse/IGNITE-19403 when this ticket will be closed. was: *{{Motivation}}* {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at 
org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} *Implementation Notes* The reason is the wrong start order of the
[jira] [Updated] (IGNITE-20058) NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter
[ https://issues.apache.org/jira/browse/IGNITE-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20058: --- Description: *{{Motivation}}* {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at 
org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} *Implementation Notes* The reason is the wrong start order of the components: # Firstly metastorage watch listeners are deployed. # Then DistributionZoneManager is started. So I change this order to fix the issue. Also I will close https://issues.apache.org/jira/browse/IGNITE-19403 when this ticket will be closed. was: {{MotivationDistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at 
org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} > NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter >
[jira] [Updated] (IGNITE-20058) NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter
[ https://issues.apache.org/jira/browse/IGNITE-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20058: --- Description: {{MotivationDistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at 
org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} was: {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at 
org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} > NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter > - > > Key: IGNITE-20058 > URL: https://issues.apache.org/jira/browse/IGNITE-20058 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Sergey Uttsel >Priority: Major >
[jira] [Assigned] (IGNITE-20058) NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter
[ https://issues.apache.org/jira/browse/IGNITE-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel reassigned IGNITE-20058: -- Assignee: Sergey Uttsel (was: Alexander Lapin) > NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter > - > > Key: IGNITE-20058 > URL: https://issues.apache.org/jira/browse/IGNITE-20058 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with > very low failure rate it fails with NPE (1 fail in 1500 runs) > {noformat} > 2023-07-25 16:48:30:520 +0400 > [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred > when processing a watch event > java.lang.NullPointerException > at > org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) > at > org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown > Source) > {noformat} > {code:java} > 2023-08-01 15:55:40:440 +0300 > [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to > notify configuration listener > java.lang.NullPointerException > at > org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) > at > org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) > at > 
org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) > at > org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown > Source){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
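The root cause described in the IGNITE-20058 notes above is an ordering problem: meta storage watch listeners are deployed before the DistributionZoneManager starts, so a watch event can observe uninitialized manager state and throw an NPE. A toy sketch of the invariant (all names illustrative, not the real Ignite components):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the start-order fix: watches must only be deployed after the
// manager they notify has started. Names are illustrative, not Ignite's API.
class StartOrder {
    final List<String> events = new ArrayList<>();
    boolean managerStarted;

    void startZoneManager() {
        managerStarted = true;
        events.add("zoneManagerStarted");
    }

    void deployWatches() {
        // Deploying watches may deliver pending events immediately; if the
        // manager is not started yet, a listener would dereference null state.
        if (!managerStarted) {
            throw new IllegalStateException("watches deployed before zone manager start");
        }
        events.add("watchesDeployed");
    }
}
```

In the corrected order, startZoneManager() runs first and deployWatches() second; the reversed order fails fast instead of producing a rare NPE inside a listener.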
[jira] [Updated] (IGNITE-20058) NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter
[ https://issues.apache.org/jira/browse/IGNITE-20058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20058: --- Description: {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} {code:java} 2023-08-01 15:55:40:440 +0300 [INFO][%test%metastorage-watch-executor-1][ConfigurationRegistry] Failed to notify configuration listener java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.updateZoneConfiguration(CausalityDataNodesEngine.java:570) at org.apache.ignite.internal.distributionzones.causalitydatanodes.CausalityDataNodesEngine.onUpdateFilter(CausalityDataNodesEngine.java:557) at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateFilter$18(DistributionZoneManager.java:774) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at 
org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source){code} was: {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with very low failure rate it fails with NPE (1 fail in 1500 runs) {noformat} 2023-07-25 16:48:30:520 +0400 [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred when processing a watch event java.lang.NullPointerException at org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown Source) {noformat} > NPE in DistributionZoneManagerAlterFilterTest#testAlterFilter > - > > Key: IGNITE-20058 > URL: https://issues.apache.org/jira/browse/IGNITE-20058 > Project: Ignite > Issue Type: Bug >Reporter: Mirza Aliev >Assignee: Alexander Lapin >Priority: Major > Labels: ignite-3 > > {{DistributionZoneManagerAlterFilterTest.testAlterFilter}} is flaky and with > very low failure rate it fails with NPE (1 fail in 1500 runs) > {noformat} > 2023-07-25 16:48:30:520 +0400 > [ERROR][%test%metastorage-watch-executor-0][WatchProcessor] Error occurred > when processing a watch event > java.lang.NullPointerException > at > 
org.apache.ignite.internal.distributionzones.DistributionZoneManager.lambda$onUpdateScaleDown$18(DistributionZoneManager.java:737) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:136) > at > org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitLeafNode(ConfigurationNotifier.java:129) > at > org.apache.ignite.internal.distributionzones.configuration.DistributionZoneNode.traverseChildren(Unknown > Source) > {noformat} > {code:java} > 2023-08-01 15:55:40:440 +0300 >
[jira] [Updated] (IGNITE-20050) Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' configuration changes
[ https://issues.apache.org/jira/browse/IGNITE-20050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20050: --- Epic Link: IGNITE-19743 > Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' > configuration changes > -- > > Key: IGNITE-20050 > URL: https://issues.apache.org/jira/browse/IGNITE-20050 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > *Motivation* > CausalityDataNodesEngine#zonesVersionedCfg contains zones' configuration > changes. It updates with revision and configuration event on a zone creation, > a scale up update, a scale up update and so on. But this map does not remove > old values. We need to keep a history of changes to some depth. > *Definition of Done* > # Find out how deep the history of changes needs to be stored. > # Remove old values. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-19506) Use data nodes from DistributionZoneManager with a causality token instead of BaselineManager#nodes
[ https://issues.apache.org/jira/browse/IGNITE-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19506: --- Description: h3. *Motivation* We need to use data nodes from DistributionZoneManager instead of BaselineManager#nodes in all places except the in-memory raft case (TableManager#calculateAssignments). We need to get data nodes consistently, so we use the revision of configuration events and meta storage events as a causality token. A description of the causality data nodes algorithm is attached. h3. *Definition of Done* Implement a method DistributionZoneManager#dataNodes that obtains data nodes from the zone manager by causality token. Use this method instead of BaselineManager#nodes. was: h3. *Motivation* We need to use data nodes from DistributionZoneManager instead of BaselineManager#nodes in: # DistributionZoneRebalanceEngine#onUpdateReplicas # TableManager#createAssignmentsSwitchRebalanceListener We need to get data nodes consistently, so we use the revision of configuration events and meta storage events as a causality token. We also need to use VersionedValue to save data nodes with a causality token. A description of the causality data nodes algorithm is attached. h3. *Definition of Done* DistributionZoneRebalanceEngine#onUpdateReplicas and TableManager#createAssignmentsSwitchRebalanceListener use data nodes from DistributionZoneManager > Use data nodes from DistributionZoneManager with a causality token instead of > BaselineManager#nodes > --- > > Key: IGNITE-19506 > URL: https://issues.apache.org/jira/browse/IGNITE-19506 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Attachments: Causality data nodes.docx > > Time Spent: 5h 20m > Remaining Estimate: 0h > > h3. 
*Motivation* > We need to use data nodes from DistributionZoneManager instead of > BaselineManager#nodes in all places except the in-memory raft case > (TableManager#calculateAssignments). > We need to get data nodes consistently, so we use the revision of > configuration events and meta storage events as a causality token. > A description of the causality data nodes algorithm is attached. > h3. *Definition of Done* > Implement a method DistributionZoneManager#dataNodes that obtains data nodes > from the zone manager by causality token. > Use this method instead of BaselineManager#nodes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
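The causality-token contract in the IGNITE-19506 description above can be sketched as a revision-indexed lookup: each configuration or meta storage event revision maps to the data nodes computed at that revision, and a caller passes the token of the event it is handling to get a consistent snapshot. This is a minimal illustration; the class and method shapes below are assumptions, not the real DistributionZoneManager#dataNodes signature.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of a causality-token-based data nodes lookup.
class CausalityDataNodes {
    // revision (causality token) -> data nodes computed at that revision.
    private final ConcurrentNavigableMap<Long, Set<String>> byRevision = new ConcurrentSkipListMap<>();

    void onRevision(long revision, Set<String> dataNodes) {
        byRevision.put(revision, Set.copyOf(dataNodes));
    }

    // Returns the data nodes snapshot effective at the given causality token:
    // the value recorded at the greatest revision not exceeding the token.
    Set<String> dataNodes(long causalityToken) {
        Map.Entry<Long, Set<String>> e = byRevision.floorEntry(causalityToken);
        return e == null ? Set.of() : e.getValue();
    }
}
```

Because lookups go through the token rather than "latest value", two callers handling the same event always see the same data nodes, which is the consistency property the ticket asks for.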
[jira] [Updated] (IGNITE-20054) Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest
[ https://issues.apache.org/jira/browse/IGNITE-20054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20054: --- Epic Link: IGNITE-19743 > Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest > - > > Key: IGNITE-20054 > URL: https://issues.apache.org/jira/browse/IGNITE-20054 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > *Motivation* > After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, > some tests started to fail. > For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` > to prevent data nodes from being updated in the meta storage, and then checks the data > nodes for the zone. But now the dataNodes method returns nodes which have > not even been written to the meta storage, because dataNodes uses the augmentation map. So I > tried to fix this and similar tests by checking data nodes in the meta storage, > but after that these tests are flaky. > *Definition of Done* > Fix and enable tests from ItIgniteDistributionZoneManagerNodeRestartTest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20053) Empty data nodes are returned by data nodes engine
[ https://issues.apache.org/jira/browse/IGNITE-20053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20053: --- Epic Link: IGNITE-19743 > Empty data nodes are returned by data nodes engine > -- > > Key: IGNITE-20053 > URL: https://issues.apache.org/jira/browse/IGNITE-20053 > Project: Ignite > Issue Type: Bug >Reporter: Denis Chudov >Priority: Major > Labels: ignite-3 > > There is a meta storage key called DISTRIBUTION_ZONES_LOGICAL_TOPOLOGY_KEY > and it is refreshed by topology listener on topology events and stores > logical topology. If the value stored by this key is null, then empty data > nodes are returned from data nodes engine on data nodes calculation for a > distribution zone. As a result, empty assignments are calculated for > partitions, which leads to exception described in IGNITE-19466. > Some integration tests, for example, ItRebalanceDistributedTest are flaky > because of possible problems with value of > DISTRIBUTION_ZONES_LOGICAL_TOPOLOGY_KEY and empty data nodes calculated by > data nodes engine. > Actually, the empty data nodes collection is a wrong result for this case > because the current logical topology is not empty. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (IGNITE-20054) Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest
[ https://issues.apache.org/jira/browse/IGNITE-20054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20054: --- Description: *Motivation* After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. *Definition of Done* Fix and enable the tests from ItIgniteDistributionZoneManagerNodeRestartTest. was: *Motivation* After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. *Definition of Done* Fix and enable the tests from > Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest > - > > Key: IGNITE-20054 > URL: https://issues.apache.org/jira/browse/IGNITE-20054 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > Labels: ignite-3 > > *Motivation* > After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. > For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. > But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. > So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. > *Definition of Done* > Fix and enable the tests from ItIgniteDistributionZoneManagerNodeRestartTest. -- This message was sent by Atlassian Jira (v8.20.10#820010)
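The fix direction above (asserting on what is actually persisted in the meta storage rather than on the locally augmented dataNodes view) typically needs a polling wait in the tests. A minimal sketch of such a wait helper; the name waitForCondition and its shape are assumptions for illustration, not the actual Ignite test utility:

```java
import java.util.function.BooleanSupplier;

// Polls a condition until it holds or the timeout elapses; used by tests
// to wait for a value to actually land in the meta storage before asserting.
class WaitUtils {
    static boolean waitForCondition(BooleanSupplier cond, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (cond.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(50); // poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return cond.getAsBoolean();
            }
        }
        return cond.getAsBoolean(); // one last check at the deadline
    }
}
```

A test would call something like waitForCondition(() -> dataNodesInMetastorage(zoneId).equals(expected), 5_000) instead of reading the augmented map directly.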
[jira] [Updated] (IGNITE-20054) Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest
[ https://issues.apache.org/jira/browse/IGNITE-20054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20054: --- Description: *Motivation* After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. *Definition of Done* Fix and enable the tests from was: *Motivation* After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. *Definition of Done* > Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest > - > > Key: IGNITE-20054 > URL: https://issues.apache.org/jira/browse/IGNITE-20054 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > Labels: ignite-3 > > *Motivation* > After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. > For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. > But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. > So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. > *Definition of Done* > Fix and enable the tests from
[jira] [Created] (IGNITE-20054) Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest
Sergey Uttsel created IGNITE-20054: -- Summary: Flaky tests in ItIgniteDistributionZoneManagerNodeRestartTest Key: IGNITE-20054 URL: https://issues.apache.org/jira/browse/IGNITE-20054 Project: Ignite Issue Type: Bug Reporter: Sergey Uttsel *Motivation* After https://issues.apache.org/jira/browse/IGNITE-19506 was implemented, some tests started to fail. For example, the test testScaleUpTimerIsRestoredAfterRestart uses `blockUpdate` to prevent data nodes from being updated in the meta storage, and then checks the data nodes for the zone. But now the dataNodes method returns nodes that have not even been written to the meta storage, because dataNodes uses the augmentation map. So I tried to fix this and similar tests by checking the data nodes in the meta storage, but after that these tests are flaky. *Definition of Done*
[jira] [Updated] (IGNITE-20050) Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' configuration changes
[ https://issues.apache.org/jira/browse/IGNITE-20050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-20050: --- Summary: Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' configuration changes (was: Clean CausalityDataNodesEngine#zonesVersionedCfg) > Clean CausalityDataNodesEngine#zonesVersionedCfg which stores zones' configuration changes > -- > > Key: IGNITE-20050 > URL: https://issues.apache.org/jira/browse/IGNITE-20050 > Project: Ignite > Issue Type: Improvement > Reporter: Sergey Uttsel > Priority: Major > Labels: ignite-3 > > *Motivation* > CausalityDataNodesEngine#zonesVersionedCfg contains zones' configuration changes. It is updated with the revision and configuration event on zone creation, on a scale up update, on a scale down update, and so on. But this map never removes old values. We need to keep the history of changes only to some depth. > *Definition of Done* > # Find out how deep the history of changes needs to be stored. > # Remove old values.
[jira] [Created] (IGNITE-20050) Clean CausalityDataNodesEngine#zonesVersionedCfg
Sergey Uttsel created IGNITE-20050: -- Summary: Clean CausalityDataNodesEngine#zonesVersionedCfg Key: IGNITE-20050 URL: https://issues.apache.org/jira/browse/IGNITE-20050 Project: Ignite Issue Type: Improvement Reporter: Sergey Uttsel *Motivation* CausalityDataNodesEngine#zonesVersionedCfg contains zones' configuration changes. It is updated with the revision and configuration event on zone creation, on a scale up update, on a scale down update, and so on. But this map never removes old values. We need to keep the history of changes only to some depth. *Definition of Done* # Find out how deep the history of changes needs to be stored. # Remove old values.
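The cleanup described above (keep a revision-keyed history only to some depth) can be sketched with a NavigableMap. This is a simplified stand-in, not the actual zonesVersionedCfg structure; the class and method names are assumptions for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Revision-keyed history: reads resolve to the latest value at or below a
// revision, and trimTo() drops entries older than a lower-bound revision
// while keeping the one entry still needed to answer reads at that bound.
class VersionedHistory<V> {
    private final ConcurrentSkipListMap<Long, V> history = new ConcurrentSkipListMap<>();

    void put(long revision, V value) {
        history.put(revision, value);
    }

    /** Latest value whose revision is <= the given one, or null if none. */
    V get(long revision) {
        Map.Entry<Long, V> e = history.floorEntry(revision);
        return e == null ? null : e.getValue();
    }

    /** Removes entries strictly older than the entry covering lowerBoundRevision. */
    void trimTo(long lowerBoundRevision) {
        Long keep = history.floorKey(lowerBoundRevision);
        if (keep != null) {
            history.headMap(keep, false).clear();
        }
    }

    int size() {
        return history.size();
    }
}
```

Deciding the value of lowerBoundRevision is exactly the "how deep" question in the Definition of Done; a natural choice is the oldest revision any in-flight causality token can still observe.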
[jira] [Resolved] (IGNITE-19507) [TC Bot] Doesn't send messages to Slack
[ https://issues.apache.org/jira/browse/IGNITE-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel resolved IGNITE-19507. Resolution: Not A Problem > [TC Bot] Doesn't send messages to Slack > --- > > Key: IGNITE-19507 > URL: https://issues.apache.org/jira/browse/IGNITE-19507 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > > TC Bot doesn't send messages to Slack. For example: > * Open [https://mtcga.gridgain.com/monitoring.html] > * Press the "Send" button for a test Slack notification. > * {*}Expected{*}: a new message in the Slack chat. > * {*}Actual{*}: no messages. >
[jira] [Comment Edited] (IGNITE-19507) [TC Bot] Doesn't send messages to Slack
[ https://issues.apache.org/jira/browse/IGNITE-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739075#comment-17739075 ] Sergey Uttsel edited comment on IGNITE-19507 at 6/30/23 1:16 PM: - The reason is that the *TC Bot account* was not added to the *tc_green_again* chat. It was not in the chat earlier either, but membership now appears to be required for it to work. was (Author: sergey uttsel): The reason is that *mtcga* was not added to the *tc_green_again* chat. It was not in the chat earlier either, but membership now appears to be required for it to work. > [TC Bot] Doesn't send messages to Slack > --- > > Key: IGNITE-19507 > URL: https://issues.apache.org/jira/browse/IGNITE-19507 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > > TC Bot doesn't send messages to Slack. For example: > * Open [https://mtcga.gridgain.com/monitoring.html] > * Press the "Send" button for a test Slack notification. > * {*}Expected{*}: a new message in the Slack chat. > * {*}Actual{*}: no messages. >
[jira] [Commented] (IGNITE-19507) [TC Bot] Doesn't send messages to Slack
[ https://issues.apache.org/jira/browse/IGNITE-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739075#comment-17739075 ] Sergey Uttsel commented on IGNITE-19507: The reason is that *mtcga* was not added to the *tc_green_again* chat. It was not in the chat earlier either, but membership now appears to be required for it to work. > [TC Bot] Doesn't send messages to Slack > --- > > Key: IGNITE-19507 > URL: https://issues.apache.org/jira/browse/IGNITE-19507 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > > TC Bot doesn't send messages to Slack. For example: > * Open [https://mtcga.gridgain.com/monitoring.html] > * Press the "Send" button for a test Slack notification. > * {*}Expected{*}: a new message in the Slack chat. > * {*}Actual{*}: no messages. >
[jira] [Updated] (IGNITE-19783) StripedScheduledExecutorService for DistributionZoneManager#executor
[ https://issues.apache.org/jira/browse/IGNITE-19783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19783: --- Description: h3. *Motivation* In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data nodes calculation tasks for a zone are executed in their creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService, and all tasks for the same zone must be executed in one stripe. The stripe that executes a task is chosen by the zone id. h3. *Definition of Done* # StripedScheduledExecutorService is created and used instead of the single-threaded executor in DistributionZoneManager. # All tasks for the same zone must be executed in one stripe. h3. *Implementation Notes* I've created a draft StripedScheduledExecutorService in the branch [https://github.com/gridgain/apache-ignite-3/tree/ignite-19783] was: h3. *Motivation* In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data nodes calculation tasks for a zone are executed in their creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService, and all tasks for the same zone must be executed in one stripe. The stripe that executes a task is chosen by the zone id. h3. *Definition of Done* # StripedScheduledExecutorService is created and used instead of the single-threaded executor in DistributionZoneManager. # All tasks for the same zone must be executed in one stripe. h3. *Implementation Notes* I've created [https://github.com/gridgain/apache-ignite-3/tree/ignite-19783] > StripedScheduledExecutorService for DistributionZoneManager#executor > > > Key: IGNITE-19783 > URL: https://issues.apache.org/jira/browse/IGNITE-19783 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > Labels: ignite-3 > > h3. *Motivation* > In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data nodes calculation tasks for a zone are executed in their creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService, and all tasks for the same zone must be executed in one stripe. The stripe that executes a task is chosen by the zone id. > h3. *Definition of Done* > # StripedScheduledExecutorService is created and used instead of the single-threaded executor in DistributionZoneManager. > # All tasks for the same zone must be executed in one stripe. > h3. *Implementation Notes* > I've created a draft StripedScheduledExecutorService in the branch > [https://github.com/gridgain/apache-ignite-3/tree/ignite-19783]
[jira] [Updated] (IGNITE-19783) StripedScheduledExecutorService for DistributionZoneManager#executor
[ https://issues.apache.org/jira/browse/IGNITE-19783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19783: --- Description: h3. *Motivation* In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data nodes calculation tasks for a zone are executed in their creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService, and all tasks for the same zone must be executed in one stripe. The stripe that executes a task is chosen by the zone id. h3. *Definition of Done* # StripedScheduledExecutorService is created and used instead of the single-threaded executor in DistributionZoneManager. # All tasks for the same zone must be executed in one stripe. h3. *Implementation Notes* I've created [https://github.com/gridgain/apache-ignite-3/tree/ignite-19783] was: h3. *Motivation* In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data nodes calculation tasks for a zone are executed in their creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService, and all tasks for the same zone must be executed in one stripe. The stripe that executes a task is chosen by the zone id. h3. *Definition of Done* # StripedScheduledExecutorService is created and used instead of the single-threaded executor in DistributionZoneManager. # All tasks for the same zone must be executed in one stripe. > StripedScheduledExecutorService for DistributionZoneManager#executor > > > Key: IGNITE-19783 > URL: https://issues.apache.org/jira/browse/IGNITE-19783 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Priority: Major > Labels: ignite-3 > > h3. *Motivation* > In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data nodes calculation tasks for a zone are executed in their creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService, and all tasks for the same zone must be executed in one stripe. The stripe that executes a task is chosen by the zone id. > h3. *Definition of Done* > # StripedScheduledExecutorService is created and used instead of the single-threaded executor in DistributionZoneManager. > # All tasks for the same zone must be executed in one stripe. > h3. *Implementation Notes* > I've created [https://github.com/gridgain/apache-ignite-3/tree/ignite-19783]
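The striping idea above (one single-threaded stripe per hash bucket, stripe chosen by zone id, so tasks for the same zone keep their submission order) can be sketched as follows. This is a minimal illustration, not the draft from the linked branch, and the class name StripedScheduler is hypothetical:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Each stripe is a single-threaded scheduler, so tasks submitted for the
// same zone id run sequentially in submission order, while different zones
// can proceed in parallel on different stripes.
class StripedScheduler {
    private final ScheduledExecutorService[] stripes;

    StripedScheduler(int stripeCount) {
        stripes = new ScheduledExecutorService[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = Executors.newSingleThreadScheduledExecutor();
        }
    }

    /** Maps a zone id to its stripe index (floorMod handles negative ids). */
    int stripeFor(int zoneId) {
        return Math.floorMod(zoneId, stripes.length);
    }

    void execute(int zoneId, Runnable task) {
        stripes[stripeFor(zoneId)].execute(task);
    }

    void shutdown() {
        for (ScheduledExecutorService s : stripes) {
            s.shutdown();
        }
    }
}
```

The per-zone ordering guarantee follows directly from each stripe being single-threaded: two tasks for the same zone always land on the same queue.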
[jira] [Updated] (IGNITE-19735) Create implementation of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Description: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: # Create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage; name it, for example, LocalMetaStorageManagerImpl. # Create a method `MetaStorageManager local()` in MetaStorageManager. # For MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl. # For LocalMetaStorageManagerImpl it will throw UnsupportedOperationException. # Methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage (for example put, invoke) must throw UnsupportedOperationException. # Remove the method MetaStorageManager#getLocally(byte[] key, long revLowerBound, long revUpperBound) and create MetaStorageManager#get(byte[] key, long revLowerBound, long revUpperBound); the behavior of this method will depend on the implementation (distributed or local). h3. *Definition of Done* # Create an implementation of MetaStorageManager for interaction with the local meta storage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. was: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: # Create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage; name it, for example, LocalMetaStorageManagerImpl. # Create a method `MetaStorageManager local()` in MetaStorageManager. # For MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl. # For LocalMetaStorageManagerImpl it will throw UnsupportedOperationException. # Methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage (for example put, invoke) must throw UnsupportedOperationException. h3. *Definition of Done* # Create an implementation of MetaStorageManager for interaction with the local meta storage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. > Create implementation of MetaStorageManager for interaction with the local meta storage > --- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: > # Create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage; name it, for example, LocalMetaStorageManagerImpl. > # Create a method `MetaStorageManager local()` in MetaStorageManager. > # For MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl. > # For LocalMetaStorageManagerImpl it will throw UnsupportedOperationException. > # Methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage (for example put, invoke) must throw UnsupportedOperationException. > # Remove the method MetaStorageManager#getLocally(byte[] key, long revLowerBound, long revUpperBound) and create MetaStorageManager#get(byte[] key, long revLowerBound, long revUpperBound); the behavior of this method will depend on the implementation (distributed or local). > h3. *Definition of Done* > # Create an implementation of MetaStorageManager for interaction with the local meta storage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
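The pattern described in the list above (a local() view that serves reads from the local storage and throws UnsupportedOperationException for distributed-only operations) can be sketched with simplified stand-in interfaces. These types are assumptions for illustration only, not the real Ignite MetaStorageManager API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the MetaStorageManager interface.
interface SimpleMetaStorage {
    byte[] get(String key);          // read: meaningful in both modes
    void put(String key, byte[] v);  // write: distributed mode only
    SimpleMetaStorage local();       // view over the local key-value storage
}

// Local view: reads come from the local map; writes and nested local() are rejected.
class LocalMetaStorage implements SimpleMetaStorage {
    private final Map<String, byte[]> kv;

    LocalMetaStorage(Map<String, byte[]> kv) { this.kv = kv; }

    @Override public byte[] get(String key) { return kv.get(key); }

    @Override public void put(String key, byte[] v) {
        throw new UnsupportedOperationException("writes require the distributed manager");
    }

    @Override public SimpleMetaStorage local() {
        throw new UnsupportedOperationException("already a local view");
    }
}

// Distributed manager; backed by an in-memory map here purely for brevity.
class DistributedMetaStorage implements SimpleMetaStorage {
    private final Map<String, byte[]> kv = new ConcurrentHashMap<>();

    @Override public byte[] get(String key) { return kv.get(key); }

    @Override public void put(String key, byte[] v) { kv.put(key, v); }

    @Override public SimpleMetaStorage local() { return new LocalMetaStorage(kv); }
}
```

With this shape, callers that only ever read can be handed manager.local() and will fail fast, at the call site, if they accidentally attempt a distributed operation.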
[jira] [Updated] (IGNITE-19735) Create implementation of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Description: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: # Create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage; name it, for example, LocalMetaStorageManagerImpl. # Create a method `MetaStorageManager local()` in MetaStorageManager. # For MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl. # For LocalMetaStorageManagerImpl it will throw UnsupportedOperationException. # Methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage (for example put, invoke) must throw UnsupportedOperationException. h3. *Definition of Done* # Create an implementation of MetaStorageManager for interaction with the local meta storage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. was: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage (named, for example, LocalMetaStorageManagerImpl); create a method `MetaStorageManager local()` in MetaStorageManager; for MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl; for LocalMetaStorageManagerImpl it will throw UnsupportedOperationException; methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage must throw UnsupportedOperationException. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. > Create implementation of MetaStorageManager for interaction with the local meta storage > --- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: > # Create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage; name it, for example, LocalMetaStorageManagerImpl. > # Create a method `MetaStorageManager local()` in MetaStorageManager. > # For MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl. > # For LocalMetaStorageManagerImpl it will throw UnsupportedOperationException. > # Methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage (for example put, invoke) must throw UnsupportedOperationException. > h3. *Definition of Done* > # Create an implementation of MetaStorageManager for interaction with the local meta storage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
[jira] [Updated] (IGNITE-19735) Create implementation of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Description: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage (named, for example, LocalMetaStorageManagerImpl); create a method `MetaStorageManager local()` in MetaStorageManager; for MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl; for LocalMetaStorageManagerImpl it will throw UnsupportedOperationException; methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage must throw UnsupportedOperationException. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. was: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create an implementation of the interface for interaction with the local KeyValueStorage. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. > Create implementation of MetaStorageManager for interaction with the local meta storage > --- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to: > create an implementation of the MetaStorageManager interface for interaction with the local KeyValueStorage (named, for example, LocalMetaStorageManagerImpl); > create a method `MetaStorageManager local()` in MetaStorageManager; > for MetaStorageManagerImpl it will return LocalMetaStorageManagerImpl; > for LocalMetaStorageManagerImpl it will throw UnsupportedOperationException; > methods in LocalMetaStorageManagerImpl that cannot work with the local meta storage must throw UnsupportedOperationException. > h3. *Definition of Done* > # Created a new interface for interaction with the local KeyValueStorage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
[jira] [Updated] (IGNITE-19735) Create implementation of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Summary: Create implementation of MetaStorageManager for interaction with the local meta storage (was: Create proxy of MetaStorageManager for interaction with the local meta storage) > Create implementation of MetaStorageManager for interaction with the local meta storage > --- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create an implementation of the interface for interaction with the local KeyValueStorage. > h3. *Definition of Done* > # Created a new interface for interaction with the local KeyValueStorage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
[jira] [Updated] (IGNITE-19735) Create proxy of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Description: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create an implementation of the interface for interaction with the local KeyValueStorage. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. was: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create a separate interface for interaction with the local KeyValueStorage. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. > Create proxy of MetaStorageManager for interaction with the local meta storage > -- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create an implementation of the interface for interaction with the local KeyValueStorage. > h3. *Definition of Done* > # Created a new interface for interaction with the local KeyValueStorage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
[jira] [Updated] (IGNITE-19735) Create implementation of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Description: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create a separate interface for interaction with the local KeyValueStorage. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. was: h3. *Motivation* MetaStorageManager has methods for distributed interaction with the meta storage. But a getEntriesLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create a separate interface for interaction with the local KeyValueStorage. h3. *Definition of Done* # Created a new interface for interaction with the local KeyValueStorage. # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface. > Create implementation of MetaStorageManager for interaction with the local meta storage > --- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create a separate interface for interaction with the local KeyValueStorage. > h3. *Definition of Done* > # Created a new interface for interaction with the local KeyValueStorage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
[jira] [Updated] (IGNITE-19735) Create proxy of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Summary: Create proxy of MetaStorageManager for interaction with the local meta storage (was: Create implementation of MetaStorageManager for interaction with the local meta storage) > Create proxy of MetaStorageManager for interaction with the local meta storage > -- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug > Reporter: Sergey Uttsel > Assignee: Sergey Uttsel > Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta storage. But a getLocally method has now been added to retrieve entries from the local KeyValueStorage, and there will be more such methods. So we need to create a separate interface for interaction with the local KeyValueStorage. > h3. *Definition of Done* > # Created a new interface for interaction with the local KeyValueStorage. > # MetaStorageManager has a method to get an instance of the local KeyValueStorage interface.
[jira] [Updated] (IGNITE-19735) Create implementation of MetaStorageManager for interaction with the local meta storage
[ https://issues.apache.org/jira/browse/IGNITE-19735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19735: --- Summary: Create implementation of MetaStorageManager for interaction with the local meta storage (was: Create interface for interaction with local KeyValueStorage of the meta storage) > Create implementation of MetaStorageManager for interaction with the local > meta storage > --- > > Key: IGNITE-19735 > URL: https://issues.apache.org/jira/browse/IGNITE-19735 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Assignee: Sergey Uttsel >Priority: Major > Labels: ignite-3 > Time Spent: 0.5h > Remaining Estimate: 0h > > h3. *Motivation* > MetaStorageManager has methods for distributed interaction with the meta > storage. A getEntriesLocally method has now been added to retrieve entries > from the local KeyValueStorage, and more such methods will follow. So we need > to create a separate interface for interaction with the local KeyValueStorage. > h3. *Definition of Done* > # A new interface for interaction with the local KeyValueStorage is created. > # MetaStorageManager has a method to get an instance of the local > KeyValueStorage interface.
[jira] [Updated] (IGNITE-19783) StripedScheduledExecutorService for DistributionZoneManager#executor
[ https://issues.apache.org/jira/browse/IGNITE-19783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19783: --- Description: h3. *Motivation* In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data node calculation tasks for a given zone are executed in creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService in which all tasks for the same zone are executed in one stripe. The stripe that executes a task is chosen by the zone id. h3. *Definition of Done* # StripedScheduledExecutorService is created and used instead of the single-thread executor in DistributionZoneManager. # All tasks for the same zone are executed in one stripe. was: h3. *Motivation* In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 for DistributionZoneManager#executor to ensure that all data node calculation tasks for a given zone are executed in creation order. But we need more threads to process these tasks. So we need to create a StripedScheduledExecutorService in which all tasks for the same zone are executed in one stripe. h3. *Definition of Done* # StripedScheduledExecutorService is created and used instead of the single-thread executor in DistributionZoneManager. # All tasks for the same zone are executed in one stripe. > StripedScheduledExecutorService for DistributionZoneManager#executor > > > Key: IGNITE-19783 > URL: https://issues.apache.org/jira/browse/IGNITE-19783 > Project: Ignite > Issue Type: Bug >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > In https://issues.apache.org/jira/browse/IGNITE-19736 we set corePoolSize=1 > for DistributionZoneManager#executor to ensure that all data node > calculation tasks for a given zone are executed in creation order. But we need > more threads to process these tasks. 
So we need to create a > StripedScheduledExecutorService in which all tasks for the same zone are > executed in one stripe. The stripe that executes a task is chosen by the > zone id. > h3. *Definition of Done* > # StripedScheduledExecutorService is created and used instead of the > single-thread executor in DistributionZoneManager. > # All tasks for the same zone are executed in one stripe.
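The striping scheme described in IGNITE-19783 can be sketched as follows. This is a minimal illustration, not the actual Ignite implementation: it assumes the stripe index is simply the zone id modulo the stripe count, and each stripe is a single-threaded scheduled executor, which preserves per-zone submission order while allowing different zones to run in parallel.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: tasks for the same zone always land on the same
// single-threaded stripe, so their creation order is preserved.
public class StripedScheduledExecutor {
    private final ScheduledExecutorService[] stripes;

    public StripedScheduledExecutor(int stripeCount) {
        stripes = new ScheduledExecutorService[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            stripes[i] = Executors.newSingleThreadScheduledExecutor();
        }
    }

    /** The stripe is derived from the zone id (assumption: plain modulo mapping). */
    public int stripeFor(int zoneId) {
        return Math.floorMod(zoneId, stripes.length);
    }

    public Future<?> submit(int zoneId, Runnable task) {
        return stripes[stripeFor(zoneId)].submit(task);
    }

    public ScheduledFuture<?> schedule(int zoneId, Runnable task, long delay, TimeUnit unit) {
        return stripes[stripeFor(zoneId)].schedule(task, delay, unit);
    }

    public void shutdown() {
        for (ScheduledExecutorService stripe : stripes) {
            stripe.shutdown();
        }
    }
}
```

Because each stripe is single-threaded, two tasks submitted for the same zone can never overlap or reorder, which is exactly the guarantee corePoolSize=1 gave, now without serializing unrelated zones behind one thread.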
[jira] [Updated] (IGNITE-19782) Throw CompactedException if the revision in KeyValueStorage methods is lower than the compaction revision
[ https://issues.apache.org/jira/browse/IGNITE-19782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19782: --- Summary: Throw CompactedException if the revision in KeyValueStorage methods is lower than the compaction revision (was: Create an ability to obtain the compaction revision) > Throw CompactedException if the revision in KeyValueStorage methods is lower > than the compaction revision > - > > Key: IGNITE-19782 > URL: https://issues.apache.org/jira/browse/IGNITE-19782 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > Some methods of RocksDbKeyValueStorage, such as doGet and doGetAll, take a > revision as a parameter. We need to check that this revision has not been > compacted and throw an exception if it is lower than the compaction revision. > For this purpose we need to know the last compacted revision. > h3. *Definition of Done* > # CompactedException is thrown if the revision in KeyValueStorage methods is > lower than the compaction revision.
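A minimal sketch of the guard described in IGNITE-19782 is shown below. The class and method names are illustrative, not the actual Ignite code; following the ticket title, the check is assumed to be strict, rejecting only revisions strictly lower than the compaction revision.

```java
// Hypothetical sketch of the revision guard described above; names are
// illustrative, not the actual Ignite code.
public class CompactionGuard {
    /** Signals that the requested revision has already been compacted away. */
    public static class CompactedException extends RuntimeException {
        public CompactedException(long requested, long compacted) {
            super("Requested revision " + requested
                    + " is lower than the compaction revision " + compacted);
        }
    }

    /** Last compacted revision; -1 means nothing has been compacted yet. */
    private volatile long compactionRevision = -1;

    public void setCompactionRevision(long revision) {
        this.compactionRevision = revision;
    }

    /**
     * Guard to invoke at the start of revision-based reads (doGet, doGetAll, ...).
     * Assumption per the ticket title: only revisions strictly lower than the
     * compaction revision are rejected.
     */
    public void checkRevision(long revision) {
        long compacted = compactionRevision;
        if (revision < compacted) {
            throw new CompactedException(revision, compacted);
        }
    }
}
```

Reading compactionRevision into a local variable once keeps the comparison and the exception message consistent even if a concurrent compaction advances the field mid-check.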
[jira] [Updated] (IGNITE-19782) Create an ability to obtain the compaction revision
[ https://issues.apache.org/jira/browse/IGNITE-19782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19782: --- Description: h3. *Motivation* Some methods of RocksDbKeyValueStorage, such as doGet and doGetAll, take a revision as a parameter. We need to check that this revision has not been compacted and throw an exception if it is lower than the compaction revision. For this purpose we need to know the last compacted revision. h3. *Definition of Done* # CompactedException is thrown if the revision in KeyValueStorage methods is lower than the compaction revision. was: h3. *Motivation* Some methods of RocksDbKeyValueStorage, such as doGet and doGetAll, take a revision as a parameter. We need to check that this revision has not been compacted and throw an exception if it is lower than the compaction revision. For this purpose we need to know the last compacted revision. h3. *Definition of Done* # A method for obtaining the compaction revision is created. # An assert is added that the revision in KeyValueStorage methods is higher than the compaction revision. > Create an ability to obtain the compaction revision > --- > > Key: IGNITE-19782 > URL: https://issues.apache.org/jira/browse/IGNITE-19782 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > Some methods of RocksDbKeyValueStorage, such as doGet and doGetAll, take a > revision as a parameter. We need to check that this revision has not been > compacted and throw an exception if it is lower than the compaction revision. > For this purpose we need to know the last compacted revision. > h3. *Definition of Done* > # CompactedException is thrown if the revision in KeyValueStorage methods is > lower than the compaction revision.
[jira] [Updated] (IGNITE-19782) Create an ability to obtain the compaction revision
[ https://issues.apache.org/jira/browse/IGNITE-19782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Uttsel updated IGNITE-19782: --- Issue Type: Improvement (was: Bug) > Create an ability to obtain the compaction revision > --- > > Key: IGNITE-19782 > URL: https://issues.apache.org/jira/browse/IGNITE-19782 > Project: Ignite > Issue Type: Improvement >Reporter: Sergey Uttsel >Priority: Major > Labels: ignite-3 > > h3. *Motivation* > Some methods of RocksDbKeyValueStorage, such as doGet and doGetAll, take a > revision as a parameter. We need to check that this revision has not been > compacted and throw an exception if it is lower than the compaction revision. > For this purpose we need to know the last compacted revision. > h3. *Definition of Done* > # A method for obtaining the compaction revision is created. > # An assert is added that the revision in KeyValueStorage methods is higher > than the compaction revision.