[
https://issues.apache.org/jira/browse/IGNITE-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vyacheslav Koptilin updated IGNITE-15364:
-----------------------------------------
Description:
It looks like the following scenario can break data consistency after rebalancing:
- Start and activate a cluster of three server nodes.
- Create a cache with two backups and fill it with initial data.
- Stop one server node and upload additional data to the cache in order to
trigger historical rebalancing after the node returns to the cluster.
- Restart the node and make sure that historical rebalancing is started from
the two other nodes.
- Before rebalancing is completed, start a new client node and let it join the
cluster. This causes the partition update counters on the server nodes, i.e.
_GridDhtPartitionTopologyImpl#cntrMap_, to be cleaned up. ( * )
- Historical rebalancing from one of the nodes fails.
- In that case, rebalancing is reassigned and the restarting node tries to
rebalance the missed partitions from another node. Unfortunately, the update
counters for historical rebalancing cannot be properly calculated due to ( * ).
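The effect of ( * ) can be illustrated with a toy model (this is a hypothetical sketch, not Ignite code: the class, method names, and counter layout are invented for illustration). Historical rebalancing needs the last known per-partition update counter (cf. _GridDhtPartitionTopologyImpl#cntrMap_) to compute the WAL range to replay; once the map is cleared, a reassigned historical rebalance has no starting counter:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Toy model of the failure mode described above, not real Ignite code.
public class CntrMapModel {
    // Partition id -> last known update counter (stand-in for cntrMap).
    private final Map<Integer, Long> cntrMap = new HashMap<>();

    void onUpdate(int part, long cntr) {
        cntrMap.put(part, cntr);
    }

    // Simulates the cleanup that happens when a client node joins ( * ).
    void clearOnClientJoin() {
        cntrMap.clear();
    }

    // Counter from which a reassigned historical rebalance would resume;
    // empty means the WAL range cannot be computed for the new supplier.
    OptionalLong historicalFrom(int part) {
        Long cntr = cntrMap.get(part);
        return cntr == null ? OptionalLong.empty() : OptionalLong.of(cntr);
    }
}
```

Under this model, a reassignment that happens after the cleanup finds no counter for the partition, which mirrors why the update counters "cannot be properly calculated" in the scenario above.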
An additional issue found while debugging: _RebalanceReassignExchangeTask_ is
skipped under some circumstances.
{code:java|title=GridCachePartitionExchangeManager.ExchangeWorker#body0}
else if (lastAffChangedVer.after(exchId.topologyVersion())) {
    // There is a new exchange which should trigger rebalancing.
    // This reassignment request can be skipped.
    if (log.isInfoEnabled()) {
        log.info("Partitions reassignment request skipped due to affinity was already changed" +
            " [reassignTopVer=" + exchId.topologyVersion() +
            ", lastAffChangedTopVer=" + lastAffChangedVer +
            ']');
    }
}
{code}
There can be cases when the current rebalance is not cancelled by a PME that
updates only the minor version, and a _RebalanceReassignExchangeTask_ is then
triggered due to partitions missed on the supplier. After that, the
_RebalanceReassignExchangeTask_ is skipped because the current minor version is
higher than the rebalance topology version. This leads to a situation where the
missed partitions on the demander remain in the MOVING state until the next PME
triggers another rebalance.
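The skip condition compares full topology versions, so a minor-only PME (e.g. a client join or cache start that changes affinity) makes `lastAffChangedVer.after(exchId.topologyVersion())` true even though the major topology is unchanged. A minimal sketch of the ordering (hypothetical `TopVer` class mimicking _AffinityTopologyVersion_ semantics; it is not the real Ignite class):

```java
// Hypothetical stand-in for AffinityTopologyVersion: ordered by major
// topology version first, then by minor version.
public class TopVer implements Comparable<TopVer> {
    final long topVer;     // incremented on node join/leave
    final int minorTopVer; // incremented on a "minor" PME, e.g. cache start

    TopVer(long topVer, int minorTopVer) {
        this.topVer = topVer;
        this.minorTopVer = minorTopVer;
    }

    @Override public int compareTo(TopVer o) {
        int cmp = Long.compare(topVer, o.topVer);
        return cmp != 0 ? cmp : Integer.compare(minorTopVer, o.minorTopVer);
    }

    boolean after(TopVer o) {
        return compareTo(o) > 0;
    }
}
```

With `rebalanceTopVer = (5, 0)` and a minor-only PME bumping `lastAffChangedVer` to `(5, 1)`, `after()` returns true and the reassignment request is skipped, even though no new rebalance was started for the missed partitions.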
> The rebalancing can be broken if historical rebalancing is reassigned after
> the client node joined the cluster.
> ---------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-15364
> URL: https://issues.apache.org/jira/browse/IGNITE-15364
> Project: Ignite
> Issue Type: Bug
> Reporter: Vyacheslav Koptilin
> Assignee: Vyacheslav Koptilin
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)