[
https://issues.apache.org/jira/browse/IGNITE-18087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexander Lapin updated IGNITE-18087:
-------------------------------------
Description:
h3. Motivation
In order to calculate dataNodes for each distribution zone, it's required to have a proper actor that will listen to:
* Logical topology events.
* Distribution zone reconfiguration events (adding, updating, removing).
* Distribution zone processing enabling/disabling.
and properly react to such events.
This ticket is about logicalTopology events only. On start, each DistributionZoneManager should register onAppeared()/onDisappeared() listeners with the following logic inside (a sketch follows below):
* onAppeared
** Locally extend the logical topology projection: topology=[] -> node A added -> topology=[A] -> node B added -> topology=[A, B]
** Locally check whether distribution zone processing is enabled; skip further steps if not. For now the enabled flag is always true.
** Iterate over all locally known distribution zones.
** Schedule the corresponding scaleUp/scaleDown/autoAdjust timers (immediate for now) and memorize the scheduler start time in volatile state.
* onDisappeared
** Locally shrink the logical topology projection: topology=[A,B] -> node A removed -> topology=[B] -> node B removed -> topology=[]
** The remaining steps are the same as for onAppeared.
When a scheduled timer fires, perform a meta storage invoke that prevents concurrent and stale updates.
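Below is a minimal sketch of that wiring. The TopologyService/TopologyEventListener shapes (named after the onAppeared()/onDisappeared() callbacks above), String node and zone names, and the single-threaded scheduler are illustrative assumptions, not the actual Ignite API:

{code:java}
// A sketch only: TopologyService and TopologyEventListener are assumed shapes
// matching the onAppeared()/onDisappeared() callbacks from this ticket, not
// the exact Ignite API; node and zone names are simplified to Strings.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DistributionZoneManager {
    /** Local projection of the logical topology: [] -> [A] -> [A, B]. */
    private final Set<String> topologyProjection = ConcurrentHashMap.newKeySet();

    /** Scheduler thread pool: started in start(), stopped in stop(). */
    private ScheduledExecutorService scheduler;

    /** For now the flag is always true. */
    private volatile boolean zoneProcessingEnabled = true;

    /** Scheduler start time, memorized in volatile state. */
    private volatile long lastScheduleTime;

    void start(TopologyService topologyService) {
        scheduler = Executors.newSingleThreadScheduledExecutor();

        topologyService.addEventListener(new TopologyEventListener() {
            @Override public void onAppeared(String node) {
                topologyProjection.add(node);
                rescheduleAllZones();
            }

            @Override public void onDisappeared(String node) {
                topologyProjection.remove(node);
                rescheduleAllZones();
            }
        });
    }

    void stop() {
        // The scheduler thread pool must be properly stopped with the component.
        scheduler.shutdownNow();
    }

    private void rescheduleAllZones() {
        if (!zoneProcessingEnabled) {
            return; // Zone processing disabled: skip further steps.
        }

        lastScheduleTime = System.currentTimeMillis();

        // Iterate over all locally known zones; scaleUp/scaleDown/autoAdjust
        // timers are immediate for now, hence zero delay.
        for (String zone : locallyKnownZones()) {
            scheduler.schedule(() -> updateDataNodes(zone), 0, TimeUnit.MILLISECONDS);
        }
    }

    private Set<String> locallyKnownZones() {
        return Set.of("zone1"); // Placeholder for the real zone registry.
    }

    private void updateDataNodes(String zone) {
        // ms.invoke(cas) goes here; see the Implementation Notes below.
    }

    interface TopologyService {
        void addEventListener(TopologyEventListener listener);
    }

    interface TopologyEventListener {
        void onAppeared(String node);

        void onDisappeared(String node);
    }
}
{code}

Keeping the projection in a concurrent set and funneling all work through a single scheduler keeps the listener callbacks cheap; the meta storage work happens on the timer thread.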
h3. Definition of Done
* Distribution zone state (dataNodes) is written to the meta storage as a result of logical topology events.
h3. Implementation Notes
* Scheduling timers assumes a scheduler thread pool that must be started and properly stopped in DistributionZoneManager.start()/stop().
* Busy locks must guard the listener and timer callbacks against concurrent component stop.
* On a timer event it's required to call a meta storage invoke that will update dataNodes for each distributionZone.
** metaStorage.invoke(cas) should allow us to do this in a distributed, thread-safe manner, meaning that if several DistributionZoneManagers try to update dataNodes on the same topology event, only one will actually succeed.
** metaStorage.invoke(cas) should also prevent stale updates. Let's consider the following example of an ABA problem (a revision-based guard that closes it is sketched after the table):
||Topology Events Provider||DistributionZoneManager on Node A||DistributionZoneManager on Node B||MetaStorage||
| |topology=[A,B] \\ zone1.dataNodes=[]|topology=[A,B] \\ zone1.dataNodes=[]|zone1.dataNodes=[]|
|node C added|onAppeared(C) -> \\ topology=[A,B,C] \\ zone1.dataNodes=[C] \\ ms.invoke(expected=[], set=[C])| |zone1.dataNodes=[C]|
|node C removed|onDisappeared(C) -> \\ topology=[A,B] \\ zone1.dataNodes=[] \\ ms.invoke(expected=[C], set=[])| |zone1.dataNodes=[]|
| | |onAppeared(C) -> \\ topology=[A,B,C] \\ zone1.dataNodes=[C] \\ ms.invoke(expected=[], set=[C])|zone1.dataNodes=[C] - {color:#de350b}*stale update!*{color}|
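One way to close that window is to guard the invoke with the key's revision observed at read time instead of its expected value. The sketch below uses a simplified, hypothetical MetaStorage facade (get/compareAndPut, Entry) standing in for the real metaStorage.invoke(cas); only the revision-based guard is the point:

{code:java}
// Simplified placeholders: MetaStorage and Entry stand in for the real meta
// storage invoke(cas) API; only the revision-based guard matters here.
import java.util.Set;

interface MetaStorage {
    /** Returns the current value of a key together with its revision. */
    Entry get(String key);

    /**
     * Atomically writes newValue iff the key's revision still equals
     * expectedRevision; returns true on success, false otherwise.
     */
    boolean compareAndPut(String key, long expectedRevision, Set<String> newValue);
}

record Entry(Set<String> value, long revision) {}

class DataNodesUpdater {
    private final MetaStorage metaStorage;

    DataNodesUpdater(MetaStorage metaStorage) {
        this.metaStorage = metaStorage;
    }

    /** Called from the scheduled timer for a single zone. */
    void updateDataNodes(String zoneDataNodesKey, Set<String> newDataNodes) {
        Entry current = metaStorage.get(zoneDataNodesKey);

        // Guarding by revision rather than by expected value closes the ABA
        // window from the table: after [] -> [C] -> [] the value once again
        // matches Node B's stale expectation, but the revision does not, so
        // the stale invoke fails instead of overwriting fresh state. It also
        // yields the "only one wins" behavior when several
        // DistributionZoneManagers react to the same topology event.
        boolean updated =
                metaStorage.compareAndPut(zoneDataNodesKey, current.revision(), newDataNodes);

        if (!updated) {
            // A concurrent or newer update won; re-read and re-evaluate
            // whether this zone still needs a dataNodes change.
        }
    }
}
{code}

A plain value CAS is exactly what the table shows failing, so the condition has to incorporate something monotonic (a revision, update counter, or event timestamp) rather than the value alone.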
> Populate DistributionZoneManager with listeners to logical topology events
> --------------------------------------------------------------------------
>
> Key: IGNITE-18087
> URL: https://issues.apache.org/jira/browse/IGNITE-18087
> Project: Ignite
> Issue Type: Improvement
> Reporter: Alexander Lapin
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)