Okay.

So I think I've found a solution.

I grabbed the latest copy of the net-snmp source code from the SVN repository.

I applied patches 1794532, 1792716 and 1805971 to the source code, but no luck:
I was not able to keep the agent up and running for more than 30 seconds.

I then applied patch 1712645 (from the unofficial patch section, entitled
"meaningful log message on duplicate IP address") and the agent is up and stable.
All walks work and the agent has been up for over 13 minutes.

I hope this helps others out there who are having the same issue with the
agent segfaulting on startup.

Jayson
From: [EMAIL PROTECTED]
To: [email protected]
Subject: RE: More net-snmp 5.4.1 startup issues.
Date: Tue, 6 Nov 2007 11:05:02 -0500

Dave,

I applied the zones patch last night to my copy of the source code and was
still having issues with it crashing.  I downloaded the net-snmp 5.4.1 code this
morning and applied the diff-zones patch (bug 1794532), and the agent was dying
with the SIGSEGV error noted below.

I then went on and patched it with the code from bug 1792716
(diff.ipaddress-patch-541), and now it no longer dies but consumes 92-93% of
the CPU and will not answer requests (one of the issues that I noted yesterday).

Now for the bad news.  I cannot remove the duplicate IP addresses, as that's the
way Linux handles bonded IP addresses.  They all show up, but the two Ethernet
devices that are bonded show up as slaves while the bonded device shows up as
the master.
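
For anyone wanting to check whether their own box is in the same situation, here
is a rough sketch (not part of net-snmp) that lists IPv4 addresses carried by
more than one interface. It assumes the iproute2 `ip` command is installed and
that `ip -o -4 addr show` prints one "<index>: <ifname> inet <addr>/<prefix> ..."
record per line. The function name shared_addresses is my own, and whether bond
slaves are reported with the address at all may depend on the kernel and tools.

```python
import subprocess
from collections import defaultdict


def shared_addresses(ip_output):
    """Return {address: [interfaces]} for IPv4 addresses seen on 2+ interfaces.

    ip_output is assumed to be the text printed by `ip -o -4 addr show`.
    """
    by_addr = defaultdict(list)
    for line in ip_output.splitlines():
        fields = line.split()
        # Assumed layout: "<index>: <ifname> inet <addr>/<prefix> ..."
        if len(fields) >= 4 and fields[2] == "inet":
            addr = fields[3].split("/")[0]
            by_addr[addr].append(fields[1])
    # Keep only addresses that appear on more than one interface.
    return {addr: names for addr, names in by_addr.items() if len(names) > 1}


if __name__ == "__main__":
    out = subprocess.run(
        ["ip", "-o", "-4", "addr", "show"],
        capture_output=True, text=True, check=True,
    ).stdout
    for addr, names in shared_addresses(out).items():
        print(addr, "is configured on", ", ".join(names))
```

If this prints a line for the bond address naming the master and its slaves, the
host has the duplicate-address layout discussed in this thread.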

GDB output from before the patch from bug 1792716, taken after snmpd crashed:

Program received signal SIGSEGV, Segmentation fault.
0x0089e390 in _int_malloc () from /lib/tls/libc.so.6
(gdb) where
#0  0x0089e390 in _int_malloc () from /lib/tls/libc.so.6
#1  0x0089fc76 in calloc () from /lib/tls/libc.so.6
#2  0x00bc57ee in ipSystemStatsTable_allocate_rowreq_ctx (data=0x843eab8, user_init_ctx=0x3)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_interface.c:426
#3  0x00bc78d7 in _add_new (systemstats_entry=0x843eab8, container=0x83a9230)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_data_access.c:263
#4  0x00d0d19b in _ba_for_each (container=0x3, f=0xbc788d <_add_new>, context=0x83a9230)
    at container_binary_array.c:342
#5  0x00bc7c11 in ipSystemStatsTable_container_load (container=0x83a9230)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_data_access.c:377
#6  0x00bc6af8 in _cache_load (cache=0x83a91f0, vmagic=0x83a9230)
    at ip-mib/ipSystemStatsTable/ipSystemStatsTable_interface.c:1212
#7  0x004add54 in _cache_load (cache=0x83a91f0) at cache_handler.c:537
#8  0x00cf396f in run_alarms () at snmp_alarm.c:252
#9  0x0804c003 in main (argc=11, argv=0xbfec4c34) at snmpd.c:1210
#10 0x00850e23 in __libc_start_main () from /lib/tls/libc.so.6
#11 0x08049f41 in _start ()
(gdb) list
1210            run_alarms();
1211
1212            netsnmp_check_outstanding_agent_requests();
1213
1214        }                           /* endwhile */
1215
1216        snmp_log(LOG_INFO, "Received TERM or STOP signal...  shutting down...\n");
1217        return 0;
1218
1219    }                               /* end receive() */
(gdb) 


GDB output after applying the patch from bug 1792716 (attached to the running process):
#0  0x0089e090 in _int_malloc () from /lib/tls/libc.so.6
#1  0x0089ff01 in malloc () from /lib/tls/libc.so.6
#2  0x0014a0ff in _sess_read (sessp=0x9977b58, fdset=0xbff33ec0) at snmp_api.c:5567
#3  0x0014ad0b in snmp_sess_read (sessp=0x9977b58, fdset=0x38) at snmp_api.c:5791
#4  0x0014ad59 in snmp_read (fdset=0xbff33ec0) at snmp_api.c:5408
#5  0x0804bffe in main (argc=10, argv=0xbff34064) at snmpd.c:1180
#6  0x00850e23 in __libc_start_main () from /lib/tls/libc.so.6
#7  0x08049f41 in _start ()
(gdb) list
1180                  snmp_read(&readfds);
1181                }
1182            } else
1183                switch (count) {
1184                case 0:
1185                    snmp_timeout();
1186                    break;
1187                case -1:
1188                    DEBUGMSGTL(("snmpd/select", "  errno = %d\n", errno));
1189                    if (errno == EINTR) {

Jayson

> Date: Tue, 6 Nov 2007 09:58:07 +0000
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: More net-snmp 5.4.1 startup issues.
> CC: [email protected]
> 
> On 05/11/2007, Jayson Robinson <[EMAIL PROTECTED]> wrote:
> > Actually I had to sanitize the data so that it could be exported.  They're
> > very careful about IP addresses leaving the building.
> 
> That's fair enough.
> 
> > I can confirm that bond0 / eth0 and eth4 all share the same IP address
> > though.
> 
> This does sound more and more as if duplicate IP addresses are the cause.
> 
> 
> > Now I just patched it with two different patches this morning:
> >
> > official patch: 1805971
> 
> That's not relevant to this particular problem.
> (Though it is worth applying anyway)
> 
> > and
> > diff.ipaddress-patch-541
> 
> I presume you mean the patch from Bugs #1794532/1792716 ?
> [It's always worth referring to tracker numbers, rather than patch
> file names!]
> 
> From the discussion in Bug #1794532, it sounds as if this patch
> does not fix the problem either.  It definitely sounds as if the later
> file diff.zones-541.pat is more promising.
> 
> Dave

_______________________________________________
Net-snmp-coders mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
