Igniters,

I have introduced DAT as a counterpart to BLAT (SAT) because the pair reflects how Ignite works.
However, I have concerns about whether such a separation is necessary. DAT exists only because we don't want to lose any data in in-memory caches, but there are alternatives.

Besides BLAT auto-change policies, I would pay attention to the following approach:
- for in-memory caches, affinity would first be calculated with SAT/BLAT, so collocation would work between in-memory and persistent caches;
- on the next step, if there are offline nodes, we would spread their partitions among the alive nodes. This would save us from data loss.

I don't want to propose any changes until we have consensus.

On Tue, Apr 24, 2018 at 7:55 PM, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:

> Vladimir,
>
> Automatic cluster membership changes may be implemented to grow the
> topology, but auto-shrinking the topology is usually not possible because
> a process cannot distinguish between a node shutdown and network
> partitioning. If we want to deal with split-brain scenarios as a grown-up
> system, we should change the replication strategy within partitions to a
> consensus algorithm (I really hope we will). None of the consensus
> algorithms (at least those known to me - Paxos, Raft, ZAB) do automatic
> cluster adjustments based on an internally detected process failure. I
> consider baseline topology a step towards this model.
>
> Addressing your second concern: if a node was down for a short period of
> time, we should (and we do) rebalance only deltas, which is faster than
> erasing the whole node and moving all data from scratch.
>
> 2018-04-24 19:42 GMT+03:00 Vladimir Ozerov <voze...@gridgain.com>:
>
> > Ivan,
> >
> > This reasoning sounds questionable to me. First, separate logic for
> > in-memory and persistent regions means that we lose collocation between
> > persistent and non-persistent caches. Second, the “data is still on disk”
> > assumption might not be valid if a node has left due to a disk crash, or
> > when data is updated on the remaining nodes.
> >
> > Tue, Apr 24, 2018 at 19:21, Ivan Rakov <ivan.glu...@gmail.com>:
> >
> > > Stan,
> > >
> > > I believe it was discussed in the design proposal thread:
> > >
> > > http://apache-ignite-developers.2346864.n4.nabble.com/Cluster-auto-activation-design-proposal-td20295.html
> > >
> > > The short answer: the backup factor decreases if a node leaves. In
> > > non-persistent mode we have to rebalance data ASAP - otherwise the last
> > > node that owns a partition may fail and the data will be lost forever.
> > > This is not necessary if data is persisted to disk storage; that is the
> > > reason for the Baseline Topology concept.
> > >
> > > Best Regards,
> > > Ivan Rakov
> > >
> > > On 24.04.2018 18:48, Stanislav Lukyanov wrote:
> > > > + for Vladimir's point - adding more complexity may (and likely will)
> > > > be even more misleading.
> > > >
> > > > Can we take a step back and discuss why we need different behavior
> > > > for persistent and in-memory caches? Can we make in-memory caches
> > > > honor baseline instead of special-casing them?
> > > >
> > > > Thanks,
> > > > Stan
> > > >
> > > >
> > > > Tue, Apr 24, 2018, 18:28 Vladimir Ozerov <voze...@gridgain.com>:
> > > >
> > > >> Guys,
> > > >>
> > > >> As a user I definitely do not want to think about BLATs, SATs, DATs,
> > > >> whatsoever. I want to query data, iterate over data, send compute
> > > >> tasks to data. If a certain node is outside of BLAT and does not have
> > > >> data, then it is not an affinity node. Can we just fix the affinity
> > > >> logic to take BLAT into account appropriately?
> > > >>
> > > >> On Tue, Apr 24, 2018 at 6:12 PM, Ivan Rakov <ivan.glu...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> Eduard,
> > > >>>
> > > >>> Can you please summarize the code changes that you are proposing?
> > > >>> I agree that BLT is a bit of a misleading term and DAT/SAT make more
> > > >>> sense.
> > > >>> However, establishing a consensus on the v2.4 Baseline Topology
> > > >>> terminology took a long time, and it seems like you are going to
> > > >>> cause a bit more perturbation.
> > > >>> I still don't understand what should be changed and how. Please
> > > >>> provide a summary of the upcoming class renamings and changes to
> > > >>> existing system parts.
> > > >>>
> > > >>> Best Regards,
> > > >>> Ivan Rakov
> > > >>>
> > > >>>
> > > >>> On 24.04.2018 17:46, Eduard Shangareev wrote:
> > > >>>
> > > >>>> Hi, Igniters,
> > > >>>>
> > > >>>> I want to raise a topic about our affinity node definition.
> > > >>>>
> > > >>>> After adding baseline (affinity) topology (BL(A)T), things started
> > > >>>> to get complicated.
> > > >>>>
> > > >>>> Plenty of bugs have appeared:
> > > >>>>
> > > >>>> IGNITE-8173
> > > >>>> ignite.getOrCreateCache(cacheConfig).iterator() method works
> > > >>>> incorrectly for a replicated cache if some data node isn't in
> > > >>>> baseline
> > > >>>>
> > > >>>> IGNITE-7628
> > > >>>> SqlQuery hangs indefinitely with an additional node that is not
> > > >>>> registered in baseline.
> > > >>>>
> > > >>>> It's because everything relies on the concept of an "affinity node".
> > > >>>> Until now it was as simple as a server node which passes the node
> > > >>>> filter. In other words, any server node which is not filtered out by
> > > >>>> the node filter.
> > > >>>>
> > > >>>> But a node which is not in BL(A)T and which passes the node filter
> > > >>>> would be treated as an affinity node. And that is definitely wrong.
> > > >>>> At least, it is a source of many bugs (I believe there are many more
> > > >>>> than the 2 which I have already mentioned).
> > > >>>>
> > > >>>> It's clear that this definition should be changed.
> > > >>>> Let's start with a new definition of "Affinity topology". Affinity
> > > >>>> topology is a set of nodes which potentially could keep data.
> > > >>>>
> > > >>>> If we use knowledge about the current implementation, we can say
> > > >>>> that
> > > >>>> 1. for in-memory cache groups it would be all server nodes;
> > > >>>> 2. for persistent cache groups it would be BL(A)T.
> > > >>>>
> > > >>>> From here on I will use Dynamic Affinity Topology (DAT) for point 1
> > > >>>> (in-memory cache groups) and Static Affinity Topology (SAT) instead
> > > >>>> of BL(A)T, i.e. for point 2.
> > > >>>>
> > > >>>> Denote the node filter as f(X), where X is an affinity topology.
> > > >>>>
> > > >>>> Then we can say that node A is an affinity node if
> > > >>>> A ∈ AT', where AT' = f(AT), where AT is DAT or SAT.
> > > >>>>
> > > >>>> It is worth mentioning that AT' is what should be passed to the
> > > >>>> affinity function of cache groups.
> > > >>>> Also, AT and AT' could change over time (BL(A)T changes or node
> > > >>>> joins/disconnections).
> > > >>>>
> > > >>>> And I don't like the fact that the choice between DAT and SAT relies
> > > >>>> on persistence settings (should we make it configurable per cache
> > > >>>> group?).
> > > >>>>
> > > >>>> Ok, I have created a ticket to implement these changes and will
> > > >>>> start working on it.
> > > >>>> https://issues.apache.org/jira/browse/IGNITE-8380 (Affinity node
> > > >>>> calculation doesn't take into account BLT).
> > > >>>>
> > > >>>> Also, I want to use these definitions (Affinity Topology, Affinity
> > > >>>> Node, DAT, SAT) in documentation and javadocs.
> > > >>>>
> > > >>>> Maybe we should also consider replacing BL(A)T with SAT.
> > > >>>>
> > > >>>> Thank you for your attention.
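
For reference, here is a rough sketch of the two ideas discussed in this thread: the affinity node definition from Eduard's original message (A ∈ AT', AT' = f(AT), with AT being DAT or SAT), and the two-step alternative from the top of this message (calculate with SAT first, then spread partitions of offline nodes among alive ones). It is only an illustration under assumptions: nodes are plain strings, the class and method names are hypothetical and are not Ignite APIs, and the round-robin spreading is my own simplification, since the thread does not say how partitions should be redistributed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical sketch only; none of these names exist in Ignite. */
final class AffinityTopologySketch {
    private AffinityTopologySketch() {}

    /** AT: SAT (the baseline) for persistent cache groups, DAT (all server nodes) for in-memory ones. */
    static Set<String> affinityTopology(boolean persistent, Set<String> serverNodes, Set<String> baseline) {
        return new LinkedHashSet<>(persistent ? baseline : serverNodes);
    }

    /** AT' = f(AT): keep only the nodes that pass the node filter. */
    static Set<String> applyNodeFilter(Set<String> at, Predicate<String> nodeFilter) {
        Set<String> filtered = new LinkedHashSet<>();
        for (String node : at) {
            if (nodeFilter.test(node)) {
                filtered.add(node);
            }
        }
        return filtered;
    }

    /** Node A is an affinity node iff A is in AT'. */
    static boolean isAffinityNode(String node, boolean persistent, Set<String> serverNodes,
                                  Set<String> baseline, Predicate<String> nodeFilter) {
        return applyNodeFilter(affinityTopology(persistent, serverNodes, baseline), nodeFilter).contains(node);
    }

    /**
     * Second step of the two-step idea: partitions whose SAT-based owner is offline
     * are handed to alive nodes. Round-robin is an assumption made for this sketch.
     */
    static Map<Integer, String> spreadOfflinePartitions(Map<Integer, String> satOwners, Set<String> aliveNodes) {
        if (aliveNodes.isEmpty()) {
            throw new IllegalArgumentException("no alive nodes to take over partitions");
        }
        List<String> alive = new ArrayList<>(aliveNodes);
        Map<Integer, String> result = new LinkedHashMap<>();
        int next = 0;
        for (Map.Entry<Integer, String> e : satOwners.entrySet()) {
            String owner = e.getValue();
            if (!aliveNodes.contains(owner)) {
                // Owner is offline: pick the next alive node in turn.
                owner = alive.get(next % alive.size());
                next++;
            }
            result.put(e.getKey(), owner);
        }
        return result;
    }
}
```

With servers {A, B, C}, baseline {A, B} and a filter that excludes B, node C would be an affinity node for an in-memory group (DAT) but not for a persistent one (SAT), which is the behavior IGNITE-8380 asks for. Partitions that keep their SAT owner stay collocated with persistent caches; only those owned by offline nodes move.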