alex-plekhanov commented on a change in pull request #8668: URL: https://github.com/apache/ignite/pull/8668#discussion_r565204716
########## File path: modules/core/src/main/java/org/apache/ignite/cache/affinity/rendezvous/ClusterNodeAttributeColocatedBackupFilter.java ########## @@ -0,0 +1,114 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.ignite.cache.affinity.rendezvous; + +import java.util.List; +import java.util.Objects; +import org.apache.ignite.cluster.ClusterNode; +import org.apache.ignite.lang.IgniteBiPredicate; + +/** + * This class can be used as a {@link RendezvousAffinityFunction#affinityBackupFilter } to create + * cache templates in Spring that force each partition's primary and backup to be co-located on nodes with the same + * attribute value. + * <p> + * + * Partition copies co-location can be helpful to group nodes into cells when fixed baseline topology is used. If all + * copies of each partition are located inside only one cell, in case of {@code backup + 1} nodes leave the cluster + * there will be data lost only if all leaving nodes belong to the same cell. Without partition copies co-location + * within a cell, most probably there will be data lost if any {@code backup + 1} nodes leave the cluster. + * + * Note: Baseline topology change can lead to inter-cell partitions migration, i.e. 
rebalance can affect all copies + * of some partitions even if only one node is changed in the baseline topology. + * <p> + * + * This implementation will discard backups rather than place copies on nodes with different attribute values. This + * avoids trying to cram more data onto remaining nodes when some have failed. + * <p> + * A node attribute to compare is provided on construction. + * + * Note: All cluster nodes, on startup, automatically register all the environment and system properties as node + * attributes. + * + * Note: Node attributes persisted in baseline topology at the time of baseline topology change. If the co-location + * attribute of some node was updated, but the baseline topology wasn't changed, the outdated attribute value can be + * used by the backup filter when this node left the cluster. To avoid this, the baseline topology should be updated + * after changing the co-location attribute. + * <p> + * This class is constructed with a node attribute name, and a candidate node will be rejected if previously selected + * nodes for a partition have a different value for attribute on the candidate node. + * </pre> + * <h2 class="header">Spring Example</h2> + * Create a partitioned cache template plate with 1 backup, where the backup will be placed in the same cell + * as the primary. Note: This example requires that the environment variable "CELL" be set appropriately on + * each node via some means external to Ignite. 
+ * <pre name="code" class="xml"> + * <property name="cacheConfiguration"> + * <list> + * <bean id="cache-template-bean" abstract="true" class="org.apache.ignite.configuration.CacheConfiguration"> + * <property name="name" value="JobcaseDefaultCacheConfig*"/> + * <property name="cacheMode" value="PARTITIONED" /> + * <property name="backups" value="1" /> + * <property name="affinity"> + * <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction"> + * <property name="affinityBackupFilter"> + * <bean class="org.apache.ignite.cache.affinity.rendezvous.ClusterNodeAttributeColocatedBackupFilter"> + * <!-- Backups must go to the same CELL as primary --> + * <constructor-arg value="CELL" /> + * </bean> + * </property> + * </bean> + * </property> + * </bean> + * </list> + * </property> + * </pre> + * <p> + */ +public class ClusterNodeAttributeColocatedBackupFilter implements IgniteBiPredicate<ClusterNode, List<ClusterNode>> { + /** */ + private static final long serialVersionUID = 1L; + + /** Attribute name. */ + private final String attrName; + + /** + * @param attrName The attribute name for the attribute to compare. + */ + public ClusterNodeAttributeColocatedBackupFilter(String attrName) { + this.attrName = attrName; + } + + /** + * Defines a predicate which returns {@code true} if a node is acceptable for a backup + * or {@code false} otherwise. An acceptable node is one where its attribute value + * is exact match with previously selected nodes. If an attribute does not

Review comment:

Agreed, with a null attribute value data loss can happen in some circumstances. But a null attribute value is only one special case of misconfiguration; there are many more cases where misconfiguration can lead to data loss (a wrong attribute value, for example). Also, I think we should not shut down the cluster at some unexpected moment because a misconfigured node joined some time ago.
For example, a node with an empty attribute can join during a maintenance window and nothing will happen. Later, under high load, the coordinator can leave the cluster, and at that point all cluster nodes will be stopped by the failure handler one by one.

If we want to handle only the special case of a null attribute value, I have another proposal: allow nodes with a null attribute value to be co-located with nodes that have any other attribute value, for example:

```
String primaryAttrVal = previouslySelected.get(0).attribute(attrName);
String candidateAttrVal = candidate.attribute(attrName);

return (primaryAttrVal == null || candidateAttrVal == null) || primaryAttrVal.equals(candidateAttrVal);
```

In this case, there will be no data loss and the cluster will remain available. What do you think?
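To make the proposed rule easier to check in isolation, here is a minimal standalone sketch of it. It is not the code from the PR: the class name `NullTolerantColocationSketch` and the use of plain `Map<String, Object>` instances in place of `ClusterNode` are assumptions made only so the example runs without Ignite on the classpath.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical stand-in for the proposed filter logic. Node attributes are
 * modelled as plain maps; a missing key plays the role of a null attribute.
 */
public class NullTolerantColocationSketch {
    /**
     * Lenient rule from the proposal: a null on either side is accepted,
     * otherwise the two values must be equal.
     */
    static boolean lenientMatch(Object primaryVal, Object candidateVal) {
        return primaryVal == null || candidateVal == null || primaryVal.equals(candidateVal);
    }

    /**
     * Mirrors the shape of the backup filter: the candidate is compared only
     * with the first previously selected node (the primary).
     */
    static boolean accept(String attrName, Map<String, Object> candidate,
        List<Map<String, Object>> previouslySelected) {
        Object primaryVal = previouslySelected.get(0).get(attrName);
        Object candidateVal = candidate.get(attrName);

        return lenientMatch(primaryVal, candidateVal);
    }

    public static void main(String[] args) {
        Map<String, Object> cellA = Collections.<String, Object>singletonMap("CELL", "A");
        Map<String, Object> cellB = Collections.<String, Object>singletonMap("CELL", "B");
        Map<String, Object> noCell = Collections.<String, Object>emptyMap();

        // Same cell, different cell, and a candidate without the attribute.
        System.out.println(accept("CELL", cellA, Collections.singletonList(cellA)));
        System.out.println(accept("CELL", cellB, Collections.singletonList(cellA)));
        System.out.println(accept("CELL", noCell, Collections.singletonList(cellA)));
    }
}
```

One property of this sketch worth keeping in mind: because only the primary is consulted, a primary with a null attribute value accepts backups from any cell, so copies of such a partition may end up spread across cells.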
