Eduard,

+1 to your proposed API for configuring Affinity Topology change policies.
Obviously, we should use "auto" as the default behavior. I believe automatic rebalancing is expected and more convenient for the majority of users.

Best Regards,
Ivan Rakov

On 26.04.2018 19:27, Eduard Shangareev wrote:
Igniters,

Ok, I want to share my thoughts about "affinity topology (AT) changing
policies".


There would be three major options:
-auto;
-manual;
-custom.

1. Automatic change.
A user could set timeouts for:
a. change AT on any topology change after some timeout (setATChangeTimeout
in seconds);
b. change AT when a node leaves, after some timeout (setATChangeOnNodeLeftTimeout
in seconds);
c. change AT when a node joins, after some timeout (setATChangeOnNodeJoinTimeout
in seconds).

b and c are more specific, so they would override a.
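
A minimal sketch of that override rule, assuming a plain value holder; the class AtChangeTimeouts and its methods are hypothetical and only mirror the setter names proposed above, they are not an existing Ignite API:

```java
import java.util.OptionalLong;

/** Hypothetical holder for the timeouts 1.a-1.c (all in seconds). */
final class AtChangeTimeouts {
    private final OptionalLong anyChange;   // 1.a, setATChangeTimeout
    private final OptionalLong onNodeLeft;  // 1.b, setATChangeOnNodeLeftTimeout
    private final OptionalLong onNodeJoin;  // 1.c, setATChangeOnNodeJoinTimeout

    AtChangeTimeouts(OptionalLong anyChange, OptionalLong onNodeLeft, OptionalLong onNodeJoin) {
        this.anyChange = anyChange;
        this.onNodeLeft = onNodeLeft;
        this.onNodeJoin = onNodeJoin;
    }

    /** Delay to use after a node leaves: 1.b if set, otherwise fall back to 1.a. */
    OptionalLong effectiveOnNodeLeft() {
        return onNodeLeft.isPresent() ? onNodeLeft : anyChange;
    }

    /** Delay to use after a node joins: 1.c if set, otherwise fall back to 1.a. */
    OptionalLong effectiveOnNodeJoin() {
        return onNodeJoin.isPresent() ? onNodeJoin : anyChange;
    }
}
```

An empty result here would mean that no automatic change is scheduled for that kind of event; that interpretation is an assumption of the sketch.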

Also, I want to introduce a mechanism of merging AT changes, which would be
turned on by default.
In other words, when the timeout is reached we would change AT to the current
topology, not to the one that existed when the timeout was scheduled.
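
The merging idea could be sketched roughly like this. The class MergingAtChanger is hypothetical, node IDs are plain strings, and the timer itself is not modeled (onTimeout() stands for the moment the configured timeout expires); whether later events should restart the timer is left open here:

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

/** Hypothetical sketch: AT is taken from the topology at fire time, not at schedule time. */
final class MergingAtChanger {
    private final Supplier<Set<String>> currentTopology; // live view of the cluster
    private Set<String> affinityTopology;
    private boolean changePending;

    MergingAtChanger(Set<String> initialAt, Supplier<Set<String>> currentTopology) {
        this.affinityTopology = new TreeSet<>(initialAt);
        this.currentTopology = currentTopology;
    }

    /** Called on every topology event; several events collapse into one pending change. */
    void onTopologyChanged() {
        changePending = true;
    }

    /** Called when the timeout expires: reads the topology now, merging all events so far. */
    void onTimeout() {
        if (!changePending)
            return;
        affinityTopology = new TreeSet<>(currentTopology.get());
        changePending = false;
    }

    Set<String> affinityTopology() {
        return affinityTopology;
    }
}
```

With this shape, a node leaving and another joining before the timeout would produce a single AT change that already contains both events.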

2. Manual change.

Current behavior. A user changes AT manually via console tools or the web console.

3. Custom.

We would give the option to plug in a custom implementation of the change
policy (class name in config).
We should pass as incoming parameters:
- current topology (collection of cluster nodes);
- current AT (affinity topology);
- map of GroupId to minimal alive backup number;
- list of configuration values (1.a, 1.b, 1.c);
- scheduler.
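
To make the parameter list above more concrete, here is one possible shape of such a policy. Everything in it is an assumption for illustration: the interface name AtChangePolicy, the method shouldChangeNow, the boolean return value, node IDs as strings, and the timeouts passed as a map are not part of any agreed or existing Ignite API:

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ScheduledExecutorService;

/** Hypothetical policy interface taking the inputs listed above. */
interface AtChangePolicy {
    /** @return true if AT should be changed to the current topology right away. */
    boolean shouldChangeNow(
        Collection<String> currentTopology,             // current cluster nodes (IDs)
        Collection<String> currentAt,                   // current affinity topology
        Map<Integer, Integer> minAliveBackupsByGroupId, // GroupId -> minimal alive backups
        Map<String, Long> timeoutsSec,                  // configured 1.a, 1.b, 1.c values
        ScheduledExecutorService scheduler);            // for deferred actions, unused below
}

/** Example implementation: react only when a node that is part of AT has left. */
final class ChangeOnNodeLossPolicy implements AtChangePolicy {
    @Override
    public boolean shouldChangeNow(
        Collection<String> currentTopology,
        Collection<String> currentAt,
        Map<Integer, Integer> minAliveBackupsByGroupId,
        Map<String, Long> timeoutsSec,
        ScheduledExecutorService scheduler) {
        // Some AT node is missing from the live topology.
        return !currentTopology.containsAll(currentAt);
    }
}
```

A real policy would probably use the scheduler to delay the change; the example ignores it to stay short.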

In addition to these configurations, I propose an orthogonal option.
4. Emergency affinity topology change.
It would change AT even if the MANUAL option is set, if at least one cache
group's backup factor drops to or below a chosen threshold (0 by default).
So, if after a node leaves only the primary partition (without backups)
remains for some cache group, we would change AT immediately.
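
The emergency condition can be expressed as a simple check. This is a sketch only: the class EmergencyAtChange and the method required are hypothetical names, and the input is the GroupId-to-minimal-alive-backups map mentioned for the custom policy:

```java
import java.util.Map;

/** Hypothetical helper for the emergency rule. */
final class EmergencyAtChange {
    private EmergencyAtChange() {
    }

    /**
     * @param minAliveBackupsByGroupId GroupId -> minimal number of alive backups.
     * @param threshold chosen backup threshold (0 by default in the proposal).
     * @return true if at least one cache group is at or below the threshold.
     */
    static boolean required(Map<Integer, Integer> minAliveBackupsByGroupId, int threshold) {
        for (int backups : minAliveBackupsByGroupId.values()) {
            if (backups <= threshold)
                return true;
        }
        return false;
    }
}
```

With the default threshold of 0, the check fires as soon as some cache group is left with primary partitions only.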


Thank you for your attention.


On Thu, Apr 26, 2018 at 6:57 PM, Eduard Shangareev <
eduard.shangar...@gmail.com> wrote:

Dmitriy,

I also think that we should think about 2.6 as the target.


On Thu, Apr 26, 2018 at 3:27 PM, Alexey Goncharuk <
alexey.goncha...@gmail.com> wrote:

Dmitriy,

I doubt we will be able to fit this in 2.5 given that we did not even agree
on the policy interface. Forcing in-memory caches to use baseline topology
will be an easy technical fix; however, we will need to update and probably
fix lots of failover tests and add new ones.

I think it makes sense to target this change to 2.6.

2018-04-25 22:25 GMT+03:00 Ilya Lantukh <ilant...@gridgain.com>:

Eduard,

I'm not sure I understand what you mean by "policy". Is it an interface
that will have a few default implementations, with the user able to
create their own? If so, could you please write an example of such an
interface (how you see it) and describe how and when its methods will be
invoked?
On Wed, Apr 25, 2018 at 10:10 PM, Eduard Shangareev <
eduard.shangar...@gmail.com> wrote:

Igniters,
I have described the issue with the current approach in the "New definition
for affinity node (issues with baseline)" topic [1].

Now we have 2 different affinity topologies (one for in-memory, another for
persistent caches).

It causes problems:
- we lose (in general) co-location between different caches;
- we can't avoid PME when a non-BLAT node joins the cluster;
- implementation should consider 2 different approaches to affinity
calculation.

So, I suggest unifying behavior of in-memory and persistent caches.
They should all use BLAT.

Their behaviors were different because we couldn't guarantee the safety of
in-memory data.
It should be fixed by a new mechanism of BLAT changing policy, which was
already discussed in "Triggering rebalancing on timeout or manually if the
baseline topology is not reassembled" [2].

And we should have a default policy which is similar to the current one
(add nodes, remove nodes automatically but after some reasonable delay
[seconds]).

After this change, we could stop using the terms 'BLAT', 'Baseline' and so
on, because there would not be an alternative: there would be only one
possible Affinity Topology.


[1]
http://apache-ignite-developers.2346864.n4.nabble.com/New-definition-for-affinity-node-issues-with-baseline-td29868.html
[2]
http://apache-ignite-developers.2346864.n4.nabble.com/Triggering-rebalancing-on-timeout-or-manually-if-the-baseline-topology-is-not-reassembled-td29299.html#none



--
Best regards,
Ilya


