I have gradually re-enabled aixdisk.conf on 3 prod machines, for a total of 
2154 metrics across the 3 hosts. gmond did choke for about 15 seconds, after 
which rrd was able to display properly again. This happened once, when I hit 
'get fresh data'. I may cut back to 2 prod hosts totaling 1890 (2154-264) 
metrics, but has anyone started to look into this gmond problem?

Thanks!


Message: 3
Date: Wed, 18 Sep 2013 08:24:56 -0400
From: Derek Smith
Subject: Re: [Ganglia-general] gmond core dumping, again on head node.

It seems that the problem stems from the aixdisk.conf and its C code.  I 
renamed aixdisk.conf and restarted gmond on all my hosts and gmond has stayed 
up for over 12 hours.
If anyone needs the core file, let me know!  Thx!

From: Derek Smith
Sent: Tuesday, September 17, 2013 02:07 PM
Subject: gmond core dumping, again on head node.

Ever since my upgrade to 3.6, gmond has been very shaky, to say the least: it 
keeps segfaulting. I have the core file if needed. Any help is much 
appreciated! Thank you!



My environment is:

AIX 6100-08-03-1339
gmond 3.6.0
gmetad 3.6.0
web front-end 3.5.10
Server version: Apache/2.4.3 (Unix)
RRDtool 1.4.8
gmond rrdcache: "/var/lib/ganglia/rrdcached/rrdcached.socket";
gmetad rrdcache: RRDCACHED_ADDRESS=/var/lib/ganglia/rrdcached/rrdcached.socket


Error report details

# cat php-errors.log
[05-Sep-2013 13:59:26 America/Detroit] PHP Notice:  Undefined index: hreg in /var/www/htdocs/ganglia3510/ganglia-web-3.5.10/graph_all_periods.php on line 84
[05-Sep-2013 14:05:06 America/Detroit] PHP Notice:  Undefined index: hreg in /var/www/htdocs/ganglia3510/ganglia-web-3.5.10/graph_all_periods.php on line 843


CORE FILE NAME
/var/adm/ras/corefiles/core.9371670.17154125
PROGRAM NAME
gmond
STACK EXECUTION DISABLED
           0
COME FROM ADDRESS REGISTER
rmgr_disa FFFFF9B4

PROCESSOR ID
  hw_fru_id: 0
  hw_cpu_id: 4

ADDITIONAL INFORMATION
extend_br 238
extend_br 1E8

Symptom Data
REPORTABLE
1
INTERNAL ERROR
0
SYMPTOM CODE
PCSS/SPI2 FLDS/gmond SIG/11 FLDS/extend_br VALU/238 FLDS/rmgr_disa


Syslog details (core dump at around 12:15 ESDT)

Sep 17 12:14:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:14:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:14:45 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:14:45 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:14:57 ganglia01ap daemon:info xntpd[4063412]: synchronized to 10.1.1.200, stratum=1
Sep 17 12:15:00 ganglia01ap daemon:notice ConfigRM[7340166]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: de84c4db:::Details File: :::Location: RSCT,IBM.ConfigRMd.C,1.57,347 :::CONFIGRM_STARTED_ST IBM.ConfigRM daemon has started.
Sep 17 12:15:00 ganglia01ap daemon:err|error ConfigRM[7340166]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: 6895a4e3:::Details File: :::Location: RSCT,IBM.ConfigRMd.C,1.57,506 :::CONFIGRM_ERROR_ER An internal error was encountered in the configuration manager daemon (IBM.ConfigRMd). Error Code 00018001 Message Catalog Name ct_rmf.cat Message Set 1 Message Identifier 7 Message Inserts 00000005
Sep 17 12:15:00 ganglia01ap daemon:notice ConfigRM[7340168]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: de84c4db:::Details File: :::Location: RSCT,IBM.ConfigRMd.C,1.57,347 :::CONFIGRM_STARTED_ST IBM.ConfigRM daemon has started.
Sep 17 12:15:00 ganglia01ap daemon:err|error ConfigRM[7340168]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: 6895a4e3:::Details File: :::Location: RSCT,IBM.ConfigRMd.C,1.57,506 :::CONFIGRM_ERROR_ER An internal error was encountered in the configuration manager daemon (IBM.ConfigRMd). Error Code 00018001 Message Catalog Name ct_rmf.cat Message Set 1 Message Identifier 7 Message Inserts 00000005
Sep 17 12:15:01 ganglia01ap daemon:notice ConfigRM[7340170]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: de84c4db:::Details File: :::Location: RSCT,IBM.ConfigRMd.C,1.57,347 :::CONFIGRM_STARTED_ST IBM.ConfigRM daemon has started.
Sep 17 12:15:01 ganglia01ap daemon:err|error ConfigRM[7340170]: (Recorded using libct_ffdc.a cv 2):::Error ID: :::Reference ID: :::Template ID: 6895a4e3:::Details File: :::Location: RSCT,IBM.ConfigRMd.C,1.57,506 :::CONFIGRM_ERROR_ER An internal error was encountered in the configuration manager daemon (IBM.ConfigRMd). Error Code 00018001 Message Catalog Name ct_rmf.cat Message Set 1 Message Identifier 7 Message Inserts 00000005
Sep 17 12:15:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:15:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:15:29 ganglia01ap daemon:info xntpd[4063412]: synchronized to 10.1.1.201, stratum=1
Sep 17 12:15:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:15:46 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:15:46 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:16:01 ganglia01ap daemon:info xntpd[4063412]: synchronized to 10.1.1.200, stratum=1
Sep 17 12:16:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:16:01 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:16:08 ganglia01ap aso:notice aso[15073350]: [HIB] Used entitlement per unfolded vCPU is below threshold (13% of a core).
Sep 17 12:16:08 ganglia01ap aso:notice aso[15073350]: [HIB] Cache optimizations will hibernate until used entitlement is at least 30% of a core per unfolded vCPU
Sep 17 12:16:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
Sep 17 12:16:16 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() got no answer from any [IBMpower] datasource
Sep 17 12:16:31 ganglia01ap user:info /opt/freeware/sbin/gmetad[8192110]: data_thread() for [IBMpower] failed to contact node 10.255.9.12
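
For anyone triaging a similar log, here is a minimal Python sketch (not part of the original report) that counts how often gmetad reported a failed contact, per node. It assumes the message wording seen in the syslog excerpt above and tolerates the stray space that line wrapping can leave inside "node". The function name count_contact_failures is hypothetical.

```python
import re
from collections import Counter

# Message wording taken from the syslog excerpt above. "nod ?e" also accepts
# the wrapped form "nod e"; the address is assumed to be dotted IPv4.
FAILED_CONTACT = re.compile(
    r"failed to contact nod ?e (\d{1,3}(?:\.\d{1,3}){3})"
)


def count_contact_failures(log_text):
    """Return a Counter mapping node address -> number of failed contacts."""
    # Collapse newlines and repeated spaces so a message that was wrapped
    # across lines matches the same way as an unwrapped one.
    flat = " ".join(log_text.split())
    return Counter(m.group(1) for m in FAILED_CONTACT.finditer(flat))


if __name__ == "__main__":
    import sys

    # Usage: python count_failures.py < syslog_excerpt.txt
    for node, hits in count_contact_failures(sys.stdin.read()).most_common():
        print(node, hits)
```

Fed the excerpt above, the only node it reports is 10.255.9.12, which is the one address gmetad complains about throughout.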

_______________________________________________
Ganglia-general mailing list
https://lists.sourceforge.net/lists/listinfo/ganglia-general


End of Ganglia-general Digest, Vol 88, Issue 8
**********************************************