[jira] [Commented] (IGNITE-8241) Docs: Triggering automatic rebalancing if the whole baseline topology is not recovered

2018-09-14 Thread Eugene Miretsky (JIRA)


[ https://issues.apache.org/jira/browse/IGNITE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615218#comment-16615218 ]

Eugene Miretsky commented on IGNITE-8241:
-----------------------------------------

[~pgarg]: Where do you run this code? On a separate management node? What
happens if that node goes down?

> Docs: Triggering automatic rebalancing if the whole baseline topology is not
> recovered
> ------------------------------------------------------------------------------
>
>                 Key: IGNITE-8241
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8241
>             Project: Ignite
>          Issue Type: Task
>          Components: documentation
>    Affects Versions: 2.4
>            Reporter: Denis Magda
>            Assignee: Prachi Garg
>            Priority: Critical
>             Fix For: 2.5
>
>         Attachments: BaselineWatcher.java
>
> The ticket was created as a result of the following discussion:
> http://apache-ignite-developers.2346864.n4.nabble.com/Triggering-rebalancing-on-timeout-or-manually-if-the-baseline-topology-is-not-reassembled-td29299.html
> Rebalancing doesn't happen if one of the nodes goes down, leaving the
> baseline topology with fewer online nodes. This matches our assumption that
> the node should be recovered soon, so there is no need to waste the
> cluster's CPU, memory, and network resources shifting data around.
> However, there are always edge cases. I was reasonably asked how to trigger
> rebalancing within the baseline topology manually or on a timeout if:
> * The failed node is not expected to come back in the near future, and
> * The node is unlikely to be replaced by another one.
> Until we embed special facilities in the baseline topology that handle
> such situations, we can document the following workaround. A user
> application/tool/script has to subscribe to node_left events and remove the
> failed node from the baseline topology after some time. Once the node is
> removed, the baseline topology changes and rebalancing is kicked off.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (IGNITE-8241) Docs: Triggering automatic rebalancing if the whole baseline topology is not recovered

2018-05-21 Thread Prachi Garg (JIRA)

[ https://issues.apache.org/jira/browse/IGNITE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16483065#comment-16483065 ]

Prachi Garg commented on IGNITE-8241:
-------------------------------------

Reviewed.



[jira] [Commented] (IGNITE-8241) Docs: Triggering automatic rebalancing if the whole baseline topology is not recovered

2018-05-17 Thread Denis Magda (JIRA)

[ https://issues.apache.org/jira/browse/IGNITE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16479858#comment-16479858 ]

Denis Magda commented on IGNITE-8241:
-------------------------------------

[~pgarg], please review the following section and close the ticket:
https://apacheignite.readme.io/v2.4/docs/cluster-activation#section-triggering-rebalancing-programmatically

> Docs: Triggering automatic rebalancing if the whole baseline topology is not
> recovered
> ------------------------------------------------------------------------------
>
>                 Key: IGNITE-8241
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8241
>             Project: Ignite
>          Issue Type: Task
>          Components: documentation
>    Affects Versions: 2.4
>            Reporter: Denis Magda
>            Assignee: Denis Magda
>            Priority: Critical
>             Fix For: 2.5
>
>         Attachments: BaselineWatcher.java


[jira] [Commented] (IGNITE-8241) Docs: Triggering automatic rebalancing if the whole baseline topology is not recovered

2018-04-18 Thread Ivan Rakov (JIRA)

[ https://issues.apache.org/jira/browse/IGNITE-8241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442919#comment-16442919 ]

Ivan Rakov commented on IGNITE-8241:
------------------------------------

I propose the following version of BaselineWatcher:
{noformat}
package org.apache.ignite.examples.events;

import java.util.Set;
import java.util.stream.Collectors;
import org.apache.ignite.Ignite;
import org.apache.ignite.cluster.BaselineNode;
import org.apache.ignite.cluster.ClusterNode;
import org.apache.ignite.events.DiscoveryEvent;
import org.apache.ignite.events.EventType;
import org.apache.ignite.internal.IgniteEx;
import org.apache.ignite.internal.processors.timeout.GridTimeoutObjectAdapter;

/**
 * Task that mimics old behavior without baseline topology. Only one task should be started for the whole cluster.
 * In case of server node leave/join, BLT will be automatically reset after {@link #bltChangeDelayMillis} delay.
 */
public class BaselineWatcher {
    /** Ignite. */
    private final IgniteEx ignite;

    /** BLT change delay millis. */
    private final long bltChangeDelayMillis;

    /**
     * @param ignite Ignite.
     * @param bltChangeDelayMillis Delay before the BLT is reset, in milliseconds.
     */
    public BaselineWatcher(Ignite ignite, long bltChangeDelayMillis) {
        this.ignite = (IgniteEx)ignite;
        this.bltChangeDelayMillis = bltChangeDelayMillis;
    }

    /**
     * Subscribes to discovery events and schedules a delayed BLT reset on each topology change.
     */
    public void start() {
        ignite.events().localListen(event -> {
            DiscoveryEvent e = (DiscoveryEvent)event;

            // Consistent IDs of the server nodes that are alive after this event.
            Set<Object> aliveSrvNodes = e.topologyNodes().stream()
                .filter(n -> !n.isClient())
                .map(ClusterNode::consistentId)
                .collect(Collectors.toSet());

            // Consistent IDs of the nodes in the current baseline topology.
            Set<Object> baseline = ignite.cluster().currentBaselineTopology().stream()
                .map(BaselineNode::consistentId)
                .collect(Collectors.toSet());

            final long topVer = e.topologyVersion();

            if (!aliveSrvNodes.equals(baseline))
                ignite.context().timeout().addTimeoutObject(new GridTimeoutObjectAdapter(bltChangeDelayMillis) {
                    @Override public void onTimeout() {
                        // Reset the BLT only if the topology has not changed again during the delay.
                        if (ignite.cluster().topologyVersion() == topVer)
                            ignite.cluster().setBaselineTopology(topVer);
                    }
                });

            // Keep the listener subscribed.
            return true;
        }, EventType.EVT_NODE_FAILED, EventType.EVT_NODE_LEFT, EventType.EVT_NODE_JOINED);
    }
}
{noformat}

Pros:
1) The baseline will be changed only once when several topology changes
happen in sequence within a short period.
2) The baseline will be changed back if the missing node eventually returns.
Simply put, the cluster will behave just as it did in 2.3.
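
Both points follow from one check: the timeout handler compares the topology version captured when it was scheduled with the version at the time it fires. Below is a minimal, Ignite-free sketch of that check. It is only a toy model under stated assumptions: the class BaselineResetModel, its defer hook, and the string node IDs are hypothetical names made up for illustration, and none of them are part of Ignite or of the attached BaselineWatcher.java.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Toy model of the delayed baseline reset (hypothetical, no Ignite classes involved). */
class BaselineResetModel {
    /** Hypothetical hook that runs a task after the configured delay. */
    private final Consumer<Runnable> defer;

    /** Node IDs in the simulated baseline. */
    private Set<String> baseline;

    /** Server nodes alive at the latest simulated topology version. */
    private Set<String> alive;

    /** Latest simulated topology version. */
    private long topVer;

    /** Number of baseline resets actually performed. */
    private int resets;

    BaselineResetModel(Set<String> initialBaseline, Consumer<Runnable> defer) {
        this.baseline = new HashSet<>(initialBaseline);
        this.alive = new HashSet<>(initialBaseline);
        this.defer = defer;
    }

    /** Simulates a node join/leave event: bumps the version and schedules a delayed check. */
    synchronized void onTopologyChange(Set<String> aliveNow) {
        alive = new HashSet<>(aliveNow);

        final long seenVer = ++topVer;

        if (!alive.equals(baseline))
            defer.accept(() -> resetIfUnchanged(seenVer));
    }

    /** Resets the baseline only if no newer topology change happened since scheduling. */
    private synchronized void resetIfUnchanged(long seenVer) {
        if (seenVer != topVer)
            return; // Superseded by a later change; that change's own task will decide.

        baseline = new HashSet<>(alive);
        resets++;
    }

    synchronized Set<String> baseline() {
        return new HashSet<>(baseline);
    }

    synchronized int resets() {
        return resets;
    }
}
```

With a real timer, the defer hook could be something like task -> scheduler.schedule(task, delayMillis, TimeUnit.MILLISECONDS) on a java.util.concurrent.ScheduledExecutorService. If two nodes leave in quick succession, two tasks are scheduled but only the second one resets the baseline, and a later rejoin schedules one more reset that restores the full set.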
