Re: IEP-4, Phase 2. Using BL(A)T for in-memory caches.

Vladimir Ozerov Thu, 26 Apr 2018 10:11:17 -0700

Guys,

I would start with estimating the effort rather than with release numbers. I
am not aware of any pressure or deadlines for AI 2.5. On the other hand, the
current behavior causes a lot of difficulties for our users, so it is better
to focus on how to restore the original behavior in the product ASAP. It could
be either the AI 2.5 release, if we understand that things can be fixed
quickly, or a quick AI 2.6 release within about a month. What do you think?


As for the public API, I would like to highlight the following important
points:
1) Affinity reassignment is critical for cluster stability, so it should be
possible to change the proposed settings at runtime.
2) We should avoid exposing cache groups in the API unless we agree on the
fate of this feature. Hopefully, they will be removed.
3) From our previous experience we know that users typically have a hard
time with multiple timeouts. If it is possible to keep a single timeout, this
would be ideal.
4) Another pain point in Ignite is user-implemented interfaces: they are hard
to deploy and manage in the cluster. For this reason I would avoid a custom
policy altogether. Instead, users can listen for discovery events or poll
cluster state from MBeans or metrics and force AT reassignment manually.

Summarizing all the points, I would propose the following API:

interface IgniteCluster {
    boolean affinityTopologyAutoReassignmentEnabled; // new property
    long affinityTopologyAutoReassignmentTimeout;    // new property

    void setAffinityTopology(long topVer);
    void setAffinityTopology(Collection<ClusterNode> nodes);
}

That is, only two new properties are added. To force the affinity topology
to change to the latest version, the user should do the following from any
place (event listener, monitoring thread, MXBean, control.sh, etc.):
setAffinityTopology(topologyVersion());
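
A rough sketch of how the forced change could look from user code. Everything here is an assumption for illustration: `AffinityTopologyControl` and `ToyCluster` are hypothetical stand-ins and not the real IgniteCluster, the properties are spelled as getters and setters, node IDs are plain strings, the timeout unit is left open, and no timers are run. The call uses the `setAffinityTopology(long)` name from the interface above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Demo: a node joins, then the affinity topology is forced to the latest version. */
final class AffinityTopologyDemo {
    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster(Arrays.asList("A", "B"));
        cluster.setAffinityTopologyAutoReassignmentEnabled(false);

        cluster.nodeJoined("C"); // topology moves on, affinity topology does not

        System.out.println("before: " + cluster.affinityNodes());
        cluster.setAffinityTopology(cluster.topologyVersion());
        System.out.println("after:  " + cluster.affinityNodes());
    }
}

/** Hypothetical stand-in for the proposed IgniteCluster additions. */
interface AffinityTopologyControl {
    boolean isAffinityTopologyAutoReassignmentEnabled();
    void setAffinityTopologyAutoReassignmentEnabled(boolean enabled);
    long getAffinityTopologyAutoReassignmentTimeout();
    void setAffinityTopologyAutoReassignmentTimeout(long timeout); // unit not fixed by the proposal
    long topologyVersion();
    void setAffinityTopology(long topVer);
}

/** Toy in-memory model: keeps one node list per topology version, starting at version 1. */
final class ToyCluster implements AffinityTopologyControl {
    private final List<List<String>> history = new ArrayList<>();
    private boolean autoEnabled = true;
    private long autoTimeout;
    private long affinityTopVer = 1;

    ToyCluster(Collection<String> initialNodes) {
        history.add(new ArrayList<>(initialNodes));
    }

    /** Records a new topology version that includes the joined node. */
    void nodeJoined(String id) {
        List<String> next = new ArrayList<>(history.get(history.size() - 1));
        next.add(id);
        history.add(next);
    }

    /** Nodes of the topology version currently used for affinity. */
    List<String> affinityNodes() {
        return new ArrayList<>(history.get((int) affinityTopVer - 1));
    }

    long affinityTopologyVersion() { return affinityTopVer; }

    @Override public boolean isAffinityTopologyAutoReassignmentEnabled() { return autoEnabled; }
    @Override public void setAffinityTopologyAutoReassignmentEnabled(boolean enabled) { autoEnabled = enabled; }
    @Override public long getAffinityTopologyAutoReassignmentTimeout() { return autoTimeout; }
    @Override public void setAffinityTopologyAutoReassignmentTimeout(long timeout) { autoTimeout = timeout; }
    @Override public long topologyVersion() { return history.size(); }

    @Override public void setAffinityTopology(long topVer) {
        if (topVer < 1 || topVer > history.size())
            throw new IllegalArgumentException("Unknown topology version: " + topVer);
        affinityTopVer = topVer;
    }
}
```

The point of the sketch is that a monitoring thread or event listener needs only one call to catch up with the current topology, so no custom policy class has to be deployed to the cluster.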

Thoughts?

Vladimir.

On Thu, Apr 26, 2018 at 7:27 PM, Eduard Shangareev <
[email protected]> wrote:

> Igniters,
>
> Ok, I want to share my thoughts about "affinity topology (AT) changing
> policies".
>
>
> There would be three major options:
> -auto;
> -manual;
> -custom.
>
> 1. Automatic change.
> A user could set timeouts for:
> a. change AT on any topology change after some timeout (setATChangeTimeout
> in seconds);
> b. change AT on node left after some timeout (setATChangeOnNodeLeftTimeout
> in seconds);
> c. change AT on node join after some timeout (setATChangeOnNodeJoinTimeout
> in seconds).
>
> b and c are more specific, so they would override a.
>
> Also, I want to introduce a mechanism for merging AT changes, which would be
> turned on by default.
> In other words, when the timeout is reached, we would change AT to the
> current topology, not to the one that was current when the timeout was
> scheduled.
>
> 2. Manual change.
>
> Current behavior. A user changes AT manually via console tools or the web
> console.
>
> 3. Custom.
>
> We would give the option to set a custom implementation of the changing
> policy (class name in config).
> We should pass as incoming parameters:
> - current topology (collection of cluster nodes);
> - current AT (affinity topology);
> - map of GroupId to minimal alive backup number;
> - list of configuration settings (1.a, 1.b, 1.c);
> - scheduler.
>
> In addition to these options, I propose an orthogonal one.
> 4. Emergency affinity topology change.
> It would change AT even if the MANUAL option is set, if the number of
> backups of at least one cache group drops to or below a chosen threshold
> (0 by default).
> So, if after a node leaves only primary partitions (without backups) remain
> for some cache group, we would change AT immediately.
>
>
> Thank you for your attention.
>
>
> On Thu, Apr 26, 2018 at 6:57 PM, Eduard Shangareev <
> [email protected]> wrote:
>
> > Dmitriy,
> >
> > I also think that we should think about 2.6 as the target.
> >
> >
> > On Thu, Apr 26, 2018 at 3:27 PM, Alexey Goncharuk <
> > [email protected]> wrote:
> >
> >> Dmitriy,
> >>
> >> I doubt we will be able to fit this in 2.5 given that we did not even
> >> agree on the policy interface. Forcing in-memory caches to use baseline
> >> topology will be an easy technical fix; however, we will need to update
> >> and probably fix lots of failover tests and add new ones.
> >>
> >> I think it makes sense to target this change to 2.6.
> >>
> >> 2018-04-25 22:25 GMT+03:00 Ilya Lantukh <[email protected]>:
> >>
> >> > Eduard,
> >> >
> >> > I'm not sure I understand what you mean by "policy". Is it an
> >> > interface that will have a few default implementations and that users
> >> > will be able to implement themselves? If so, could you please write an
> >> > example of such an interface (how you see it) and explain how and when
> >> > its methods will be invoked?
> >> >
> >> > On Wed, Apr 25, 2018 at 10:10 PM, Eduard Shangareev <
> >> > [email protected]> wrote:
> >> >
> >> > > Igniters,
> >> > > I have described the issue with the current approach in the "New
> >> > > definition for affinity node (issues with baseline)" topic [1].
> >> > >
> >> > > Now we have 2 different affinity topologies (one for in-memory,
> >> > > another for persistent caches).
> >> > >
> >> > > It causes problems:
> >> > > - we lose (in general) co-location between different caches;
> >> > > - we can't avoid PME when non-BLAT node joins cluster;
> >> > > - implementation should consider 2 different approaches to affinity
> >> > > calculation.
> >> > >
> >> > > So, I suggest unifying behavior of in-memory and persistent caches.
> >> > > They should all use BLAT.
> >> > >
> >> > > Their behaviors were different because we couldn't guarantee the
> >> > > safety of in-memory data.
> >> > > It should be fixed by a new mechanism of BLAT changing policy which
> >> > > was already discussed in "Triggering rebalancing on timeout or
> >> > > manually if the baseline topology is not reassembled" [2].
> >> > >
> >> > > And we should have a default policy which is similar to the current
> >> > > one (add nodes, remove nodes automatically but after some reasonable
> >> > > delay [seconds]).
> >> > >
> >> > > After this change, we could stop using the terms 'BLAT', 'Baseline'
> >> > > and so on, because there would be no alternative. There would be
> >> > > only one possible Affinity Topology.
> >> > >
> >> > >
> >> > > [1]
> >> > > http://apache-ignite-developers.2346864.n4.nabble.com/New-definition-for-affinity-node-issues-with-baseline-td29868.html
> >> > > [2]
> >> > > http://apache-ignite-developers.2346864.n4.nabble.com/Triggering-rebalancing-on-timeout-or-manually-if-the-baseline-topology-is-not-reassembled-td29299.html#none
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > Best regards,
> >> > Ilya
> >> >
> >>
> >
> >
>
