Re: [Ganglia-general] Ganglia setup with 100 nodes
Thanks, Vladimir, for the support. I will request an MW as soon as possible to change the configuration.

By the way, could you please explain how the monitoring node actually checks the status of the other nodes? Is there some kind of heartbeat mechanism in Ganglia?

Thanks,
Sarath

From: Vladimir Vuksan <vli...@veus.hr>
Sent: Wednesday, December 7, 2016 7:48 PM
Subject: Re: [Ganglia-general] Ganglia setup with 100 nodes
To: <ganglia-general@lists.sourceforge.net>

I don't believe that should be an issue, but it's puzzling that it just stops working. Can you try switching to unicast to see if that makes a difference? Here is the quick start document:
https://github.com/ganglia/monitor-core/wiki/Ganglia-Quick-Start

Vladimir

Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
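On the heartbeat question: as far as I understand it, each gmond periodically announces a heartbeat, and the XML that gmond serves carries two attributes per host, TN (seconds since the host was last heard from) and TMAX (the longest expected gap between reports). Below is a rough Python sketch of a liveness check built on those attributes. It assumes a host counts as down once TN exceeds 4 * TMAX, which is my recollection of what the web frontend does and is not confirmed in this thread. The helper names `fetch_gmond_xml` and `stale_hosts`, the sample XML, the host names, and the default TCP port 8649 are all assumptions made for illustration.

```python
import socket
import xml.etree.ElementTree as ET

# Assumption: a host is considered down when TN > DOWN_FACTOR * TMAX.
DOWN_FACTOR = 4


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump that gmond writes on its TCP accept channel.

    Port 8649 is assumed to be the configured tcp_accept_channel port.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def stale_hosts(xml_text, factor=DOWN_FACTOR):
    """Return the names of hosts whose TN exceeds factor * TMAX."""
    root = ET.fromstring(xml_text)
    stale = []
    for host in root.iter("HOST"):
        tn = int(host.get("TN", "0"))
        tmax = int(host.get("TMAX", "0"))
        if tmax and tn > factor * tmax:
            stale.append(host.get("NAME"))
    return stale


# Hypothetical sample: db02 has not been heard from for 300 seconds.
SAMPLE = """<GANGLIA_XML VERSION="3.7.2" SOURCE="gmond">
<CLUSTER NAME="DB_CLUS" LOCALTIME="1481120000">
<HOST NAME="db01" IP="10.0.0.1" REPORTED="1481119990" TN="10" TMAX="20" DMAX="0"/>
<HOST NAME="db02" IP="10.0.0.2" REPORTED="1481119700" TN="300" TMAX="20" DMAX="0"/>
</CLUSTER>
</GANGLIA_XML>"""

print(stale_hosts(SAMPLE))  # prints ['db02']
```

Against a live node, `stale_hosts(fetch_gmond_xml("some-node"))` should list the hosts that node has stopped hearing from, which may help tell a delivery problem apart from a display problem.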
Re: [Ganglia-general] Ganglia setup with 100 nodes
I don't believe that should be an issue, but it's puzzling that it just stops working. Can you try switching to unicast to see if that makes a difference? Here is the quick start document:
https://github.com/ganglia/monitor-core/wiki/Ganglia-Quick-Start

Vladimir

On 12/07/2016 at 05:59 AM, Vc, Sarathchandran (Nokia - IN/Bangalore) wrote:

We have a 100-node cluster monitored by Ganglia, and we are facing an issue with the setup. Here are the details.

Three cluster groups have been created, each monitored on its own multicast IP:

APP_CLUS   239.2.11.71   10 nodes
DB_CLUS    239.2.11.72   80 nodes
SUP_CLUS   239.2.11.73   10 nodes

Once I start the gmond service on all 100 nodes, the Ganglia web page shows the status of every node correctly. After exactly 5 minutes, 70 nodes from DB_CLUS are reported as dead and critical, even though the nodes are up and running :(. I have changed everything in the configuration, including the multicast IPs, but the result is the same. If I restart gmond on all the nodes, the same issue comes back after 5 minutes.

Is there any limitation in multicast IP or in Ganglia?

We have a 20-node cluster that works fine without any issues. The only difference between the two setups is that the 100-node cluster runs with DNS and the 20-node cluster uses local host names.
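For the suggested switch to unicast, here is a minimal Python sketch that renders a gmond.conf fragment per cluster group. The cluster names come from this thread; the collector host names, the per-cluster ports, and the helper name `unicast_fragment` are hypothetical. The fragment follows the `udp_send_channel` / `udp_recv_channel` layout as I understand it from the quick start, so treat it as a starting point to compare against that document rather than a verified configuration.

```python
# Hypothetical collector host and port per cluster group.
COLLECTORS = {
    "APP_CLUS": ("app-collector.example.com", 8649),
    "DB_CLUS": ("db-collector.example.com", 8650),
    "SUP_CLUS": ("sup-collector.example.com", 8651),
}


def unicast_fragment(cluster, collector, port=8649):
    """Render a gmond.conf fragment that sends metrics to one collector.

    No mcast_join line is emitted, so every node reports directly to
    the collector instead of to a multicast group.
    """
    return (
        f"cluster {{\n"
        f'  name = "{cluster}"\n'
        f"}}\n\n"
        f"udp_send_channel {{\n"
        f"  host = {collector}\n"
        f"  port = {port}\n"
        f"  ttl = 1\n"
        f"}}\n\n"
        f"udp_recv_channel {{\n"
        f"  port = {port}\n"
        f"}}\n"
    )


for name, (host, port) in COLLECTORS.items():
    print(f"# ---- {name} ----")
    print(unicast_fragment(name, host, port))
```

Strictly speaking only the collector needs the `udp_recv_channel` block. I also believe unicast setups need a non-zero `send_metadata_interval` in the `globals` section, but please check that against the quick start.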
Re: [Ganglia-general] Ganglia setup with 100 nodes
Hello,

We have a 100-node cluster monitored by Ganglia, and we are facing an issue with the setup. Here are the details.

Three cluster groups have been created, each monitored on its own multicast IP:

APP_CLUS   239.2.11.71   10 nodes
DB_CLUS    239.2.11.72   80 nodes
SUP_CLUS   239.2.11.73   10 nodes

Once I start the gmond service on all 100 nodes, the Ganglia web page shows the status of every node correctly. After exactly 5 minutes, 70 nodes from DB_CLUS are reported as dead and critical, even though the nodes are up and running :(. I have changed everything in the configuration, including the multicast IPs, but the result is the same. If I restart gmond on all the nodes, the same issue comes back after 5 minutes.

Is there any limitation in multicast IP or in Ganglia?

We have a 20-node cluster that works fine without any issues. The only difference between the two setups is that the 100-node cluster runs with DNS and the 20-node cluster uses local host names.

Thanks & Regards,
Sarathchandran
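Since the one stated difference between the two setups is DNS versus local host names, one way to rule that in or out is to check that each node name resolves forward and back to the same name. The Python sketch below does that. Whether a DNS mismatch actually explains the dead nodes is not established in this thread, and the helper name `check_dns` and the host names are hypothetical. The demo at the bottom uses stub resolvers so it runs without a network; calling `check_dns(name)` with the defaults uses the real resolver.

```python
import socket


def check_dns(hostname, forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP and back, and compare the two names.

    Returns (hostname, ip, reverse_name, status), where status is
    "ok", "mismatch", or a "lookup failed: ..." message.
    """
    try:
        ip = forward(hostname)
        reverse_name = reverse(ip)[0]
    except OSError as exc:  # covers socket.gaierror and socket.herror
        return (hostname, None, None, f"lookup failed: {exc}")
    same = reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return (hostname, ip, reverse_name, "ok" if same else "mismatch")


# Stub resolvers with made-up data, standing in for real DNS.
_FORWARD = {"db01.example.com": "10.0.0.1", "db02.example.com": "10.0.0.2"}
_REVERSE = {"10.0.0.1": "db01.example.com", "10.0.0.2": "db-old.example.com"}


def stub_forward(name):
    return _FORWARD[name]


def stub_reverse(ip):
    return (_REVERSE[ip], [], [ip])


for node in ("db01.example.com", "db02.example.com"):
    print(check_dns(node, stub_forward, stub_reverse))
```

Running it over the 80 DB_CLUS node names and looking for anything other than "ok" would show quickly whether the 70 affected nodes have something in common on the DNS side.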