RE: Determine Node Health?

2018-10-11 Thread Stanislav Lukyanov
You’re creating a new cache on each health check call and never destroying
it. Of course that leads to a memory leak, and it’s also awful for
performance.

Don’t create a new cache each time. If you really want to check that cache
operations work, use the same cache every time.

Thanks,
Stan
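
The advice above (reuse one cache instead of creating a new one per call) can be sketched roughly as follows. This is only an illustration: `CacheRegistry` is a hypothetical in-memory stand-in so the example runs without a cluster, and `CacheProbe` and the cache name `health_check_probe` are made-up names. With Ignite itself, the corresponding lookup would be `ignite.getOrCreateCache(name)`, which returns the existing cache when one with that name already exists.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical stand-in for the cluster's named caches, so the sketch runs
 * without Ignite. With Ignite, getOrCreate(name) would be
 * ignite.getOrCreateCache(name).
 */
final class CacheRegistry {
    private final Map<String, Map<String, String>> caches = new ConcurrentHashMap<>();

    /** Returns the cache with this name, creating it only on first use. */
    Map<String, String> getOrCreate(String name) {
        return caches.computeIfAbsent(name, n -> new ConcurrentHashMap<>());
    }

    /** Number of caches created so far. */
    int cacheCount() {
        return caches.size();
    }
}

/** Health probe that always reuses one fixed cache. */
final class CacheProbe {
    // Assumed fixed name: no timestamp, so repeated calls hit the same cache.
    static final String PROBE_CACHE = "health_check_probe";

    private final CacheRegistry registry;

    CacheProbe(CacheRegistry registry) {
        this.registry = registry;
    }

    /** Does one put/get round trip and removes the probe key afterwards. */
    boolean roundTripWorks() {
        Map<String, String> cache = registry.getOrCreate(PROBE_CACHE);
        cache.put("test", "test");
        try {
            return "test".equals(cache.get("test"));
        } finally {
            cache.remove("test");
        }
    }
}
```

However many times `roundTripWorks()` is called, only one cache exists and it holds no leftover entries, so the probe itself no longer grows the heap.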





Re: Determine Node Health?

2018-10-09 Thread Jason.G
Hi vgrigorev,

I used your suggestion to run a health check on each node, but I hit a memory
leak and the process exits with an OOM error: Java heap space.

Below is my example code: 

// I create one bean to collect the info I want (IP, hostname, create time)
// and then return it as a JSON string.
IgniteHealthCheckEntity healthCheck = new IgniteHealthCheckEntity();
ClusterNode node = ignite.cluster().localNode();
List<String> addresses = (List<String>) node.addresses();
String ip = addresses.get(0);

List<String> hostnames = (List<String>) node.hostNames();
String hostname = hostnames.get(0);

healthCheck.setServerIp(ip);
healthCheck.setStatus(0);
healthCheck.setServerHostname(hostname);
healthCheck.setMonitorTime(monitorTime);
healthCheck.setClientIp(clientIp);
// A new, uniquely named cache is created on every call.
String cacheName = "test_monitor_" + ipStr + "_" + new Date().getTime();

IgniteCache<String, String> putCache = ignite.createCache(cacheName);
putCache.put("test", "test");
String value = putCache.get("test");
if (!"test".equals(value)) {
    message = "Ignite (" + ip + ") " + "get/put value failed";
    healthCheck.setMessage(message);
    return JSONObject.fromObject(healthCheck).toString();
} else {
    message = "OKOKOK";
    healthCheck.setMessage(message);
    return JSONObject.fromObject(healthCheck).toString();
}





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/


Re: Determine Node Health?

2018-09-05 Thread Chris Berry
Thanks to you both!






Re: Determine Node Health?

2018-09-05 Thread vgrigorev
I would propose making a periodic call to each node, one by one,
with some simple remote function.
Measure each node's response time, and if a node is too slow for your
needs, avoid using that node for some period.

How to choose the nodes to call, one or many:

IgniteCompute compute = ignite.compute(ignite.cluster().forNodeIds(
    /* set UUID here */));
final Collection<String> mapKexs = compute.broadcast(
    new IgniteCallable<String>() {
        // Inject the Ignite instance of the node that runs this callable.
        @IgniteInstanceResource
        private Ignite ignite;

        @Override
        public String call() throws Exception {
            // consistentId() returns an Object, so convert it explicitly.
            String nodeId = ignite.cluster().localNode().consistentId().toString();
            log.debug(" DIAGNOSTICS: node is `{}`", nodeId);

            return nodeId;
        }
    });
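
The second half of the proposal (measure each node's response time and skip slow nodes for a while) could be sketched like this. It is a minimal illustration, not part of Ignite: `SlowNodeTracker`, its thresholds, and the explicit `nowMillis` parameter are all assumptions made for the example. The latency would come from timing a probe call such as the broadcast above.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Benches nodes whose probe latency exceeds a threshold (hypothetical helper). */
final class SlowNodeTracker {
    private final long maxLatencyMillis; // assumed "too slow" threshold
    private final long penaltyMillis;    // assumed time a slow node stays benched
    private final Map<UUID, Long> benchedUntil = new ConcurrentHashMap<>();

    SlowNodeTracker(long maxLatencyMillis, long penaltyMillis) {
        this.maxLatencyMillis = maxLatencyMillis;
        this.penaltyMillis = penaltyMillis;
    }

    /**
     * Records one probe result. The current time is passed in so the logic
     * can be exercised without waiting on a real clock.
     */
    void record(UUID nodeId, long latencyMillis, long nowMillis) {
        if (latencyMillis > maxLatencyMillis) {
            benchedUntil.put(nodeId, nowMillis + penaltyMillis);
        }
    }

    /** A node is usable if it was never benched or its penalty has expired. */
    boolean isUsable(UUID nodeId, long nowMillis) {
        Long until = benchedUntil.get(nodeId);
        return until == null || nowMillis >= until;
    }
}
```

The IDs of the nodes that are still usable can then be passed to `ignite.cluster().forNodeIds(...)` when building the compute, as in the snippet above.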





Re: Determine Node Health?

2018-09-05 Thread Alex Plehanov
Hello Chris,

There is no such metric as "node is healthy" right now, but each node provides
a lot of low-level metrics, such as CPU usage, memory usage, and job
execution/waiting times, which you can combine to define your own
criteria for a "healthy node". These metrics are available cluster-wide and
contain information for each node; see the ClusterGroup#metrics() and
ClusterNode#metrics() methods.
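
One way to combine such metrics into a custom "healthy node" rule is sketched below. Everything here is an assumption for illustration: `NodeSnapshot` and `HealthCriteria` are hypothetical classes, the three chosen values and their thresholds are arbitrary, and the snapshot is plain data so the example runs without a cluster. In a real grid the values would be read from `ClusterNode#metrics()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain copy of a few per-node values (hypothetical holder class). */
final class NodeSnapshot {
    final double cpuLoad;             // fraction between 0.0 and 1.0
    final double avgJobExecuteMillis; // average job execution time
    final int waitingJobs;            // jobs currently queued

    NodeSnapshot(double cpuLoad, double avgJobExecuteMillis, int waitingJobs) {
        this.cpuLoad = cpuLoad;
        this.avgJobExecuteMillis = avgJobExecuteMillis;
        this.waitingJobs = waitingJobs;
    }
}

/** A node is healthy only if every value is within its limit. */
final class HealthCriteria {
    private final double maxCpuLoad;
    private final double maxAvgJobExecuteMillis;
    private final int maxWaitingJobs;

    HealthCriteria(double maxCpuLoad, double maxAvgJobExecuteMillis, int maxWaitingJobs) {
        this.maxCpuLoad = maxCpuLoad;
        this.maxAvgJobExecuteMillis = maxAvgJobExecuteMillis;
        this.maxWaitingJobs = maxWaitingJobs;
    }

    boolean isHealthy(NodeSnapshot s) {
        return s.cpuLoad <= maxCpuLoad
            && s.avgJobExecuteMillis <= maxAvgJobExecuteMillis
            && s.waitingJobs <= maxWaitingJobs;
    }

    /** Returns the names of all nodes that fail the criteria, in map order. */
    List<String> unhealthyNodes(Map<String, NodeSnapshot> byNode) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, NodeSnapshot> e : byNode.entrySet()) {
            if (!isHealthy(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

A node pinned at 100% CPU, like the one in the original question, would fail the first check and could be left out of the ClusterGroup used for the compute.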




Determine Node Health?

2018-09-04 Thread Chris Berry
Hi,

We are using an Ignite ComputeGrid, and it is mostly working nicely. 

Recently we had a Node with "Noisy Neighbors" in AWS that wreaked havoc in
our ComputeGrid.
Even though that Node was quite slow, it was never removed from the
map/reduce, which slowed down all computes.

We have already built a system that allows us to add/subtract Nodes to the
ComputeGrid based on when they are actually “ready to compute”,
because our Nodes take considerable time to be truly ready for computation
(i.e. quite a bit of preparation is required).
So, to accomplish this, we use a dynamic Ignite ClusterGroup when we create
the compute.

```
ClusterGroup readyNodes =
readyForComputeMonitor.getNodesReadyForCompute(ignite.cluster());
log.debug(dumpClusterGroup(readyNodes));
return ignite.compute(readyNodes);
```

So. My question.
Does Ignite keep any information that we can use to determine if a Node is
healthy?
I.e. some way that we can locate any outliers in the ComputeGrid?

For example, the Node in our recent incident was at 100% CPU and was much,
much slower in the reduce phase.

Any help/advice would be much appreciated.

Thanks, 
-- Chris 




