RE: Determine Node Health?
You're creating a new cache on each health check call and never destroying any of them. Of course that leads to a memory leak, and it is also awful for performance.

Don't create a new cache each time. If you really want to check that cache operations work, use the same cache every time.

Thanks,
Stan

From: Jason.G
Sent: October 10, 2018, 8:49
To: user@ignite.apache.org
Subject: Re: Determine Node Health?

> Hi vgrigorev,
> I used your suggestion to do a health check for each node. But I got a memory
> leak and the process exited with an OOM error: java heap space.
>
> -- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
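As a rough illustration of the advice above, the probe can obtain its cache once, under a fixed name, and reuse it for every check. This is only a sketch: `ProbeCache`, `CacheHealthProbe`, `InMemoryProbeCache`, and the cache name `health_check` are hypothetical names, not part of Ignite. With a real cluster, the factory would presumably wrap `ignite.getOrCreateCache(name)` in a small adapter.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical stand-in for the two cache operations the probe needs. */
interface ProbeCache {
    void put(String key, String value);
    String get(String key);
}

/** Health probe that obtains its cache once and reuses it on every check. */
final class CacheHealthProbe {
    // Fixed name (an assumption): no timestamp suffix, so checks never pile up caches.
    static final String CACHE_NAME = "health_check";

    private final ProbeCache cache;

    // The factory is invoked exactly once, here. With Ignite it could adapt
    // ignite.getOrCreateCache(name) to ProbeCache (assumption, not shown).
    CacheHealthProbe(Function<String, ProbeCache> cacheFactory) {
        this.cache = cacheFactory.apply(CACHE_NAME);
    }

    /** Writes a fresh value under the given key and reads it back. */
    boolean check(String nodeKey) {
        String expected = Long.toString(System.nanoTime());
        cache.put(nodeKey, expected);
        return expected.equals(cache.get(nodeKey));
    }
}

/** In-memory ProbeCache, used only to exercise the probe without a cluster. */
final class InMemoryProbeCache implements ProbeCache {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    @Override public void put(String key, String value) { data.put(key, value); }
    @Override public String get(String key) { return data.get(key); }
}
```

Using one key per node keeps the cache at a constant size no matter how often the check runs, which is the property the original code was missing.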
Re: Determine Node Health?
Hi vgrigorev,

I used your suggestion to do a health check for each node. But I got a memory leak and the process exited with an OOM error: java heap space.

Below is my example code:

```
// One bean collects the info I want (IP, hostname, create time)
// and is then returned as a JSON string.
IgniteHealthCheckEntity healthCheck = new IgniteHealthCheckEntity();
ClusterNode node = ignite.cluster().localNode();

List<String> addresses = (List<String>) node.addresses();
String ip = addresses.get(0);
List<String> hostnames = (List<String>) node.hostNames();
String hostname = hostnames.get(0);

healthCheck.setServerIp(ip);
healthCheck.setStatus(0);
healthCheck.setServerHostname(hostname);
healthCheck.setMonitorTime(monitorTime);
healthCheck.setClientIp(clientIp);

// A new cache with a unique name is created on every call.
String cacheName = "test_monitor_" + ipStr + "_" + new Date().getTime();
IgniteCache<String, String> putCache = ignite.createCache(cacheName);
putCache.put("test", "test");
String value = putCache.get("test");

if (!"test".equals(value)) {
    message = "Ignite (" + ip + ") " + "get/put value failed";
    healthCheck.setMessage(message);
    return JSONObject.fromObject(healthCheck).toString();
} else {
    message = "OKOKOK";
    healthCheck.setMessage(message);
    return JSONObject.fromObject(healthCheck).toString();
}
```
Re: Determine Node Health?
Thanks to you both!
Re: Determine Node Health?
I would propose making a periodic call to all nodes, one by one, with some simple remote function. Measure the response time of each node, and if a node is too slow for your needs, avoid using it for some period.

How to choose the nodes to call, single or many:

```
IgniteCompute compute = ignite.compute(ignite.cluster().forNodeIds(/* set UUIDs here */));

final Collection<String> mapKeys = compute.broadcast(
    new IgniteCallable<String>() {
        // Inject Ignite instance.
        @IgniteInstanceResource
        private Ignite ignite;

        @Override
        public String call() throws Exception {
            Object id = ignite.cluster().localNode().consistentId();
            log.debug(" DIAGNOSTICS: node is `{}`", id);
            // consistentId() returns Object, so convert it for the String result.
            return String.valueOf(id);
        }
    });
```
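The second half of this suggestion, skipping a slow node for some period, could be tracked with a small helper like the one below. It is only a sketch under assumptions: `SlowNodeTracker`, the latency threshold, and the cool-down length are hypothetical, and the latency passed to `record` is assumed to be the wall-clock time measured around a probe call to a single node.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Tracks probe latencies and benches nodes that respond too slowly. */
final class SlowNodeTracker {
    private final Duration maxLatency; // probes slower than this bench the node
    private final Duration coolDown;   // how long a benched node is skipped
    private final Map<UUID, Instant> benchedUntil = new ConcurrentHashMap<>();

    SlowNodeTracker(Duration maxLatency, Duration coolDown) {
        this.maxLatency = maxLatency;
        this.coolDown = coolDown;
    }

    /** Records one probe result that was measured at time 'now'. */
    void record(UUID nodeId, Duration latency, Instant now) {
        if (latency.compareTo(maxLatency) > 0) {
            benchedUntil.put(nodeId, now.plus(coolDown));
        }
    }

    /** Returns the candidates that are not benched at time 'now', in input order. */
    Set<UUID> usable(Collection<UUID> candidates, Instant now) {
        Set<UUID> result = new LinkedHashSet<>();
        for (UUID id : candidates) {
            Instant until = benchedUntil.get(id);
            if (until == null || !now.isBefore(until)) {
                result.add(id);
            }
        }
        return result;
    }
}
```

The ids returned by `usable` could then be handed to `ignite.cluster().forNodeIds(...)` when building the compute. A benched node becomes usable again once its cool-down expires, at which point the next probe decides whether it stays in.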
Re: Determine Node Health?
Hello Chris,

There is currently no "node is healthy" metric, but each node provides many low-level metrics, such as CPU usage, memory usage, and job execution/waiting time, which you can combine to define your own criteria for a "healthy node". These metrics are available cluster-wide and contain information for each node; see the ClusterGroup#metrics() and ClusterNode#metrics() methods.

Wed, Sep 5, 2018 at 0:39, Chris Berry:
> Does Ignite keep any information that we can use to determine if a Node is
> healthy?
> I.e. some way that we can locate any outliers in the ComputeGrid?
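One way to combine such metrics into a custom "healthy node" criterion is sketched below. The names `NodeSample` and `HealthCriteria` and both thresholds are hypothetical. In a real cluster the sample values would presumably be read from `ClusterNode#metrics()`, for example `getCurrentCpuLoad()` (a fraction in the range 0 to 1) and `getAverageJobWaitTime()`.

```java
import java.util.ArrayList;
import java.util.List;

/** Snapshot of the metrics this sketch looks at; the field choice is an assumption. */
final class NodeSample {
    final String nodeId;
    final double cpuLoad;      // fraction in [0, 1]
    final double avgJobWaitMs; // average time jobs spend queued, in milliseconds

    NodeSample(String nodeId, double cpuLoad, double avgJobWaitMs) {
        this.nodeId = nodeId;
        this.cpuLoad = cpuLoad;
        this.avgJobWaitMs = avgJobWaitMs;
    }
}

/** A node counts as healthy only if every metric is within its threshold. */
final class HealthCriteria {
    private final double maxCpuLoad;
    private final double maxJobWaitMs;

    HealthCriteria(double maxCpuLoad, double maxJobWaitMs) {
        this.maxCpuLoad = maxCpuLoad;
        this.maxJobWaitMs = maxJobWaitMs;
    }

    boolean isHealthy(NodeSample s) {
        return s.cpuLoad <= maxCpuLoad && s.avgJobWaitMs <= maxJobWaitMs;
    }

    /** Returns the ids of the samples that fail the criteria, in input order. */
    List<String> unhealthy(List<NodeSample> samples) {
        List<String> ids = new ArrayList<>();
        for (NodeSample s : samples) {
            if (!isHealthy(s)) {
                ids.add(s.nodeId);
            }
        }
        return ids;
    }
}
```

With thresholds such as `new HealthCriteria(0.9, 500.0)`, a node pinned at 100% CPU like the one in the incident described in this thread would be reported by `unhealthy` and could be left out of the ClusterGroup used for computes.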
Determine Node Health?
Hi,

We are using an Ignite ComputeGrid, and it is mostly working nicely.

Recently we had a Node with "Noisy Neighbors" in AWS that wreaked havoc in our ComputeGrid. Even though that Node was quite slow, it was never removed from the map/reduce, which slowed down all computes.

We have already built a system that allows us to add/subtract Nodes to the ComputeGrid based on when they are actually "ready to compute", because our Nodes take considerable time to be truly ready for computation (i.e. quite a bit of preparation is required). To accomplish this, we use a dynamic Ignite ClusterGroup when we create the compute.

```
ClusterGroup readyNodes = readyForComputeMonitor.getNodesReadyForCompute(ignite.cluster());
log.debug(dumpClusterGroup(readyNodes));
return ignite.compute(readyNodes);
```

So, my question: does Ignite keep any information that we can use to determine if a Node is healthy? I.e. some way that we can locate any outliers in the ComputeGrid?

For example, the Node in our recent incident was at 100% CPU and was much, much slower in the reduce phase.

Any help/advice would be much appreciated.

Thanks,
-- Chris