Re: Triggering rebalancing on timeout or manually if the baseline topology is not reassembled

Ivan Rakov Thu, 12 Apr 2018 14:22:55 -0700

Guys,

I have also heard complaints about the absence of an option to change
the baseline topology automatically. They absolutely make sense.
What Pavel suggested will work as a workaround. I think that in future
releases we should give the user an option to enable similar behavior
via Ignite Configuration.
It may be called "Baseline Topology change policy". I see it as a
rule-based language that lets the user specify the conditions of a BLT
change using several parameters: a timeout and the minimum allowed
number of partition copies left (maybe this option should also be
provided at the per-cache-group level). The policy could also specify
conditions for including new nodes in the BLT if they are present,
including node attribute filters and so on.


What do you think?

Best Regards,
Ivan Rakov

On 12.04.2018 19:41, Pavel Kovalenko wrote:

Denis,

It's just one of the ways to implement it. We can also subscribe to node
join / fail events to properly track the downtime of a node.
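
The event-based variant could be sketched in plain Java as below. This is only an illustration under stated assumptions: NodeDowntimeTracker and its methods are hypothetical names, not Ignite API, and consistent IDs are modeled as plain objects. In a real deployment the two callbacks would presumably be wired to discovery events such as EVT_NODE_FAILED, EVT_NODE_LEFT and EVT_NODE_JOINED through ignite.events().localListen(...).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not Ignite API): records when each node went down. */
public class NodeDowntimeTracker {
    // consistent ID -> moment the node was reported as failed or left
    private final Map<Object, Instant> downSince = new ConcurrentHashMap<>();

    /** Call from a node-failed / node-left event handler. */
    public void onNodeFailed(Object consistentId, Instant when) {
        // Keep the earliest timestamp if the failure is reported twice.
        downSince.putIfAbsent(consistentId, when);
    }

    /** Call from a node-joined event handler: the node is back, stop the clock. */
    public void onNodeJoined(Object consistentId) {
        downSince.remove(consistentId);
    }

    /** Returns the IDs of nodes that have been down for at least the timeout. */
    public List<Object> downLongerThan(Duration timeout, Instant now) {
        List<Object> expired = new ArrayList<>();
        downSince.forEach((id, since) -> {
            if (Duration.between(since, now).compareTo(timeout) >= 0) {
                expired.add(id);
            }
        });
        return expired;
    }
}
```

The nodes returned by downLongerThan() are the candidates for removal from the baseline. Compared with polling, the downtime is measured from the moment the event arrives rather than from the first poll that notices the node is missing.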

2018-04-12 19:38 GMT+03:00 Pavel Kovalenko <[email protected]>:

Denis,

Using our API we can implement this task as follows.
Do each minute:
1) Get the consistent IDs of all alive server nodes =>
   ignite().context().discovery().aliveServerNodes() => mapToConsistentIds().
2) Get the current baseline topology =>
   ignite().cluster().currentBaselineTopology()
3) For each node that is in the baseline but not among the alive server
   nodes, check the timeout for this node.
4) If the timeout is reached, remove the node from the baseline.
5) If the baseline has changed, set the new baseline =>
   ignite().cluster().setNewBaseline()
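
The polling steps above can be sketched in plain Java as follows. This is only a sketch under stated assumptions: BaselineWatchdog and check() are hypothetical names, consistent IDs are modeled as plain objects, and the cluster calls are left to the caller. With a real cluster the two input sets would presumably be built from ignite.cluster().currentBaselineTopology() and ignite.cluster().forServers().nodes(), and a changed result would be applied with ignite.cluster().setBaselineTopology(...), which I assume is the public counterpart of the setNewBaseline() step.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not Ignite API): one polling pass over the baseline. */
public class BaselineWatchdog {
    private final Duration timeout;
    // consistent ID -> moment the node was first observed missing
    private final Map<Object, Instant> missingSince = new HashMap<>();

    public BaselineWatchdog(Duration timeout) {
        this.timeout = timeout;
    }

    /**
     * Steps 3 to 5: returns the baseline without the nodes that have been
     * missing for at least the timeout. The result equals the input
     * baseline when nothing has to change.
     */
    public Set<Object> check(Set<Object> baseline, Set<Object> alive, Instant now) {
        Set<Object> result = new LinkedHashSet<>(baseline);
        // Forget nodes that came back or are no longer part of the baseline.
        missingSince.keySet().removeIf(id -> alive.contains(id) || !baseline.contains(id));
        for (Object id : baseline) {
            if (alive.contains(id)) {
                continue;
            }
            // Start the clock on the first pass that sees the node missing.
            Instant since = missingSince.computeIfAbsent(id, k -> now);
            if (Duration.between(since, now).compareTo(timeout) >= 0) {
                result.remove(id);
                missingSince.remove(id);
            }
        }
        return result;
    }
}
```

A scheduled task would call check() once a minute and set the new baseline only when the returned set differs from the current one, so an unchanged cluster causes no baseline change.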


2018-04-12 2:18 GMT+03:00 Denis Magda <[email protected]>:

Pavel, Val,

So, it means that the rebalancing will be initiated only after an
administrator removes the failed node from the topology, right?

Next, imagine that you are that IT administrator who has to automate the
rebalancing activation if a node has failed and has not recovered within
1 minute. What would you do, and what does Ignite provide to fulfill the
task?

--
Denis

On Wed, Apr 11, 2018 at 1:01 PM, Pavel Kovalenko <[email protected]>
wrote:

Denis,

In case of an incomplete baseline topology IgniteCache.rebalance() will do
nothing, because this event doesn't trigger a partition exchange or an
affinity change, so the states of the existing partitions are kept.

2018-04-11 22:27 GMT+03:00 Valentin Kulichenko <[email protected]>:

Denis,

In my understanding, in this case you should remove the node from the BLT
and that will trigger the rebalancing, no?

-Val

On Wed, Apr 11, 2018 at 12:23 PM, Denis Magda <[email protected]>
wrote:

Igniters,

As we know, the rebalancing doesn't happen if one of the nodes goes down,
thus shrinking the baseline topology. It complies with our assumption that
the node should be recovered soon and there is no need to waste
CPU/memory/networking resources of the cluster shifting the data around.

However, there are always edge cases. I was reasonably asked how to trigger
the rebalancing within the baseline topology manually or on timeout if:

    - It's not expected that the failed node will be resurrected in the
      near future and
    - It's not likely that that node will be replaced by another one.

The question: if I call IgniteCache.rebalance() or configure
CacheConfiguration.rebalanceTimeout, will the rebalancing be fired within
the baseline topology?

--
Denis
