Re: [Ganglia-developers] override_ip causing gmond to crash
I believe this was a problem caused by using the wrong APR pool in the apr_pstrcat() call.

https://github.com/ganglia/monitor-core/pull/62

--Nick.

Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
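As a rough illustration of how allocating from the wrong pool can corrupt a string, here is a minimal sketch in plain C. It does not use APR: `toy_pool`, `toy_pstrcat` and `toy_pool_clear` are hypothetical stand-ins that only borrow the NULL-terminated calling convention of apr_pstrcat(). The thread does not say which pool was wrong or when it was cleared, so the scenario modelled here (a string built in a scratch pool that is later cleared and reused while the string is still referenced) is an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a memory pool. This is NOT how APR implements pools;
 * it only models "memory handed out by a pool is reused after a clear". */
typedef struct {
    char   buf[256];
    size_t used;
} toy_pool;

/* Concatenate a NULL-terminated list of strings into memory owned by the
 * pool. Returns NULL if the pool is full. */
char *toy_pstrcat(toy_pool *p, ...)
{
    va_list ap;
    const char *s;
    size_t len = 0;

    /* First pass: measure the total length of all arguments. */
    va_start(ap, p);
    while ((s = va_arg(ap, char *)) != NULL)
        len += strlen(s);
    va_end(ap);

    if (p->used + len + 1 > sizeof p->buf)
        return NULL;

    char *out = p->buf + p->used;
    char *w = out;

    /* Second pass: copy each argument into the pool's buffer. */
    va_start(ap, p);
    while ((s = va_arg(ap, char *)) != NULL) {
        size_t n = strlen(s);
        memcpy(w, s, n);
        w += n;
    }
    va_end(ap);

    *w = '\0';
    p->used += len + 1;
    return out;
}

/* Clearing a pool makes all of its memory available for reuse, so any
 * pointer handed out earlier may be overwritten by later allocations. */
void toy_pool_clear(toy_pool *p)
{
    p->used = 0;
}
```

In this model, a host string built from a pool that outlives the message stays intact, while the same call against a scratch pool ends up pointing at whatever was allocated after the clear. That is one way a host name field could come out as garbage on some machines and not others.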
Re: [Ganglia-developers] override_ip causing gmond to crash
Hi,

The APR-instead-of-homegrown version (#49) is still causing corruption of the host name field on the server that I was having problems with before [1].

The current version in github is ...

    cb->msg.Ganglia_value_msg_u.gstr.metric_id.host =
        apr_pstrcat(gm_pool,
                    (char *)( override_ip != NULL ? override_ip : override_hostname ),
                    ":", (char *) override_hostname, NULL);

I've slightly modified the above version to the following and it seems to work OK...

    override_ip = override_ip != NULL ? override_ip : override_hostname;
    cb->msg.Ganglia_value_msg_u.gstr.metric_id.host =
        apr_pstrcat(gm_pool, override_ip, ":", override_hostname, NULL);

I assume there is some subtle difference between the two that someone on the developer list could explain to me. Do people think this would be robust enough to work in all cases?

Regards,
Nick

[1] The HOST NAME tag was corrupted as follows...

    <HOST NAME="U\xc2\xa69" IP="" REPORTED="1348821943" TN="20" TMAX="20" DMAX="86400" LOCATION="unspecified" GMOND_STARTED="0" TAGS="os:Linux datacentre:dev virtual:physical"></HOST>

On Thu, Sep 27, 2012 at 10:23 AM, Nicholas Satterly nfsatte...@gmail.com wrote:

> Paul, thanks for that. However, I'd be more inclined to get the APR version working as it should.
>
> Vladimir, were there specific bug reports for gmond crashing? Or any more information to help us narrow down what the root cause may have been?
>
> --Nick.
Re: [Ganglia-developers] override_ip causing gmond to crash
Paul, thanks for that. However, I'd be more inclined to get the APR version working as it should.

Vladimir, were there specific bug reports for gmond crashing? Or any more information to help us narrow down what the root cause may have been?

--Nick.

On Wed, Sep 26, 2012 at 9:20 AM, Paul Hewlett paul.hewl...@arm.com wrote:

> Hi Nicholas
>
> The +1 should be +2 in the malloc() call – one for the terminating null and one for the ‘:’ character.
>
> Regards
>
> --
> Paul Hewlett
> ARM Ltd
>
> From: Nicholas Satterly [mailto:nfsatte...@gmail.com]
> Sent: 26 September 2012 00:49
> To: ganglia-developers@lists.sourceforge.net
> Subject: [Ganglia-developers] override_ip causing gmond to crash
>
> Hi,
>
> I've discovered that on some of our systems (perhaps only half a dozen out of 500 or so) gmond crashes if the override_ip configuration option is set.
>
> I've worked out that the problem is something to do with this block of code...
>
> #if 1
>       char* tmpstr = malloc( strlen(( override_ip != NULL ? override_ip : override_hostname ))
>                              + strlen( override_hostname ) + 1 );
>       strcpy (tmpstr, (char *)( override_ip != NULL ? override_ip : override_hostname ) );
>       strcat (tmpstr, ":");
>       strcat (tmpstr, (char *) override_hostname);
>
>       cb->msg.Ganglia_value_msg_u.gstr.metric_id.host = tmpstr;
> #endif
> #if 0
>       cb->msg.Ganglia_value_msg_u.gstr.metric_id.host =
>           apr_pstrcat(gm_pool,
>                       (char *)( override_ip != NULL ? override_ip : override_hostname ),
>                       ":", (char *) override_hostname, NULL);
> #endif
>
> What I'm trying to understand at the moment is why the apr_pstrcat version is "#if 0" commented out when it seems to work OK during my testing.
>
> Thanks,
> Nick
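Paul's +2 correction can be sketched as a small standalone helper. With only +1, the final strcat() writes the terminating null one byte past the end of the allocation. That is undefined behaviour and often goes unnoticed, which could plausibly explain why only a few machines crashed. `build_host_id` is a hypothetical name used for illustration and is not part of gmond, and using snprintf() in place of the strcpy()/strcat() sequence is a choice made for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of gmond: returns a newly malloc()ed
 * "<ip-or-hostname>:<hostname>" string, or NULL on allocation failure.
 * The caller owns the result and must free() it. */
char *build_host_id(const char *override_ip, const char *override_hostname)
{
    assert(override_hostname != NULL);

    /* Fall back to the hostname when no IP override is configured. */
    const char *first = override_ip != NULL ? override_ip : override_hostname;

    /* +2: one byte for the ':' separator and one for the terminating '\0'. */
    size_t size = strlen(first) + strlen(override_hostname) + 2;

    char *buf = malloc(size);
    if (buf == NULL)
        return NULL;

    /* snprintf never writes more than size bytes, terminator included. */
    snprintf(buf, size, "%s:%s", first, override_hostname);
    return buf;
}
```

Computing the size once and passing it to snprintf() keeps the allocation and the write in agreement, so a miscounted separator cannot silently overflow the buffer.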
Re: [Ganglia-developers] override_ip causing gmond to crash
IIRC we tried to use APR for portability but we saw crashes in that piece of code on certain platforms (Ubuntu comes to mind). We could try to fix it the right way again.

Vladimir

On Wed, 26 Sep 2012, Nicholas Satterly wrote:

> Hi,
>
> I've discovered that on some of our systems (perhaps only half a dozen out of 500 or so) gmond crashes if the override_ip configuration option is set.
>
> I've worked out that the problem is something to do with this block of code...
>
> #if 1
>       char* tmpstr = malloc( strlen(( override_ip != NULL ? override_ip : override_hostname ))
>                              + strlen( override_hostname ) + 1 );
>       strcpy (tmpstr, (char *)( override_ip != NULL ? override_ip : override_hostname ) );
>       strcat (tmpstr, ":");
>       strcat (tmpstr, (char *) override_hostname);
>
>       cb->msg.Ganglia_value_msg_u.gstr.metric_id.host = tmpstr;
> #endif
> #if 0
>       cb->msg.Ganglia_value_msg_u.gstr.metric_id.host =
>           apr_pstrcat(gm_pool,
>                       (char *)( override_ip != NULL ? override_ip : override_hostname ),
>                       ":", (char *) override_hostname, NULL);
> #endif
>
> What I'm trying to understand at the moment is why the apr_pstrcat version is "#if 0" commented out when it seems to work OK during my testing.
>
> Thanks,
> Nick