Based on the "poll() timeout from source" messages, either you are having network connectivity issues polling the gmonds, or the gmonds are down.
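One rough way to tell the two cases apart is to connect to each gmond by hand from the gmetad host and see whether the XML dump arrives in full. The following is only a minimal sketch, not anything shipped with Ganglia: probe_gmond is a made-up helper name, and it assumes the gmonds listen on the default TCP port 8649 (check tcp_accept_channel in your gmond.conf if you changed it).

```python
import socket


def probe_gmond(host, port=8649, timeout=10.0):
    """Connect to a gmond TCP channel and read until the peer closes.

    Hypothetical helper. Returns a tuple (ok, bytes_read, detail):
      ok         -- True if the whole dump was read before the timeout
      bytes_read -- number of bytes received before success or failure
      detail     -- "ok", "timeout", or "error: <reason>"
    """
    bytes_read = 0
    try:
        # create_connection applies the timeout to the connect step.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            # Apply the same timeout to every subsequent recv() call.
            sock.settimeout(timeout)
            while True:
                chunk = sock.recv(65536)
                if not chunk:
                    # gmond closes the connection once the dump is sent.
                    break
                bytes_read += len(chunk)
    except socket.timeout:
        # Connect or read stalled: roughly what gmetad logs as a poll() timeout.
        return (False, bytes_read, "timeout")
    except OSError as exc:
        # Connection refused, host unreachable, name lookup failure, etc.
        return (False, bytes_read, "error: %s" % exc)
    return (True, bytes_read, "ok")
```

Run it against every host listed in the data_source lines of your gmetad.conf. A "timeout" with 0 bytes read, or a connection error, suggests the gmond is down or unreachable; a "timeout" after some bytes were read looks more like a slow or lossy network path.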

Vladimir

On 03/17/2015 12:46 AM, 潇湘居士 wrote:
Hi,

    My gmetad (3.6.1) suddenly stopped, and a long time had passed before I noticed that it was down.
    Here is the log:
    ----------------------------------------------------------------------------------------------------------------------------------------
[root@ca5 log]# grep -v RRD_update messages | tail -n 20
Mar 16 15:30:31 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 0 for [hb] data source after 0 bytes read
Mar 16 15:30:46 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 1 for [hb] data source after 0 bytes read
Mar 16 15:31:01 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 0 for [hb] data source after 0 bytes read
Mar 16 15:54:40 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 1 for [hb] data source after 0 bytes read
Mar 16 15:54:48 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 1 for [dp] data source after 0 bytes read
Mar 16 15:54:54 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 1 for [stat] data source after 43261 bytes read
Mar 16 15:54:56 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 0 for [hb] data source after 0 bytes read
Mar 16 15:55:09 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 0 for [test] data source after 5427 bytes read
Mar 16 15:55:10 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 0 for [dp] data source after 11584 bytes read
Mar 16 15:55:27 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 1 for [dp] data source after 0 bytes read
Mar 16 15:55:31 ca5 /usr/sbin/gmetad[28087]: poll() timeout from source 1 for [hb] data source after 0 bytes read
Mar 16 18:26:22 ca5 last message repeated 2 times
Mar 16 18:28:23 ca5 kernel: gmetad[28126]: segfault at 000000003fc22580 rip 0000003f1320ba5f rsp 0000000059073790 error 4
Mar 16 19:58:23 ca5 auditd[3025]: Audit daemon rotating log files
Mar 17 00:54:25 ca5 Server Administrator: Storage Service EventID: 2243  The Patrol Read has stopped.:  Controller 0 (PERC H700 Integrated)
Mar 17 01:05:29 ca5 auditd[3025]: Audit daemon rotating log files
Mar 17 01:30:04 ca5 auditd[3025]: Audit daemon rotating log files
Mar 17 02:00:10 ca5 /usr/sbin/gmetad[6865]: data_thread() for [db] failed to contact node 66.160.159.72
Mar 17 02:06:14 ca5 last message repeated 3 times
Mar 17 03:13:29 ca5 last message repeated 2 times
    ----------------------------------------------------------------------------------------------------------------------------------------

My system is RHEL 5.5:
----------------------------------------------------------------------------------------------------------------------------------------
[root@ca5 log]# lsb_release -a
LSB Version:    :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID:    RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Release:    5.5
Codename:    Tikanga
----------------------------------------------------------------------------------------------------------------------------------------

I don't know why gmetad stopped. Can anyone help me?




_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general


