Dave,
Actually I had to sanitize the data so that it could be exported. They're very
careful about IP addresses leaving the building.
I can confirm that bond0, eth0 and eth4 all share the same IP address though.
With duplicate IP addresses configured I see the insert (-1) errors. Eventually
I also see the following in the logs, right before the net-snmp agent dies:
error on subcontainer remove (-1)
Received SNMP packet(s) from UDP: [127.0.0.1]:32790
error on subcontainer 'ia_addr' insert (-1)
error on subcontainer 'ia_index' insert (-1)
error on subcontainer 'ia_addr' insert (-1)
error on subcontainer 'ia_index' insert (-1)
netsnmp_assert (((void *)0) != lhs) && (((void *)0) != rhs) failed
ip-mib/data_access/ipaddress_ioctl.c:103 netsnmp_ioctl_ipaddress_entry_copy()
arch ipaddress copy failed
error on subcontainer remove (-1)
couldn't map value 1 for ipAddressAddrType
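In case it helps anyone comparing logs across hosts, here is a rough sketch of how the failed inserts above could be tallied per subcontainer. It only assumes the message format shown in the log excerpt; the function name and the sample text are mine, not anything from net-snmp.

```python
import re
from collections import Counter

# Matches lines such as: error on subcontainer 'ia_addr' insert (-1)
# (format taken from the log excerpt above; other formats are not handled)
INSERT_ERROR = re.compile(r"error on subcontainer '([^']+)' insert \((-?\d+)\)")


def count_insert_errors(log_text):
    """Return a Counter mapping subcontainer name to number of failed inserts."""
    counts = Counter()
    for line in log_text.splitlines():
        match = INSERT_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample copied from the log lines quoted in this message.
SAMPLE_LOG = """\
error on subcontainer remove (-1)
Received SNMP packet(s) from UDP: [127.0.0.1]:32790
error on subcontainer 'ia_addr' insert (-1)
error on subcontainer 'ia_index' insert (-1)
error on subcontainer 'ia_addr' insert (-1)
error on subcontainer 'ia_index' insert (-1)
"""

if __name__ == "__main__":
    for name, count in sorted(count_insert_errors(SAMPLE_LOG).items()):
        print(f"{name}: {count}")
```

The "remove (-1)" lines carry no subcontainer name, so this sketch deliberately ignores them.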
A backtrace reveals the following:
Program received signal SIGABRT, Aborted.
0x008237a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb) where
#0 0x008237a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x008637d5 in raise () from /lib/tls/libc.so.6
#2 0x00865149 in abort () from /lib/tls/libc.so.6
#3 0x0089727a in __libc_message () from /lib/tls/libc.so.6
#4 0x0089dabf in _int_free () from /lib/tls/libc.so.6
#5 0x0089de3a in free () from /lib/tls/libc.so.6
#6 0x006e3283 in netsnmp_access_ipaddress_entry_free () from
/opt/net-snmp//lib/libnetsnmpmibs.so.15
#7 0x006980be in ipAddressTable_release_data () from
/opt/net-snmp//lib/libnetsnmpmibs.so.15
#8 0x00698227 in ipAddressTable_rowreq_ctx_cleanup () from
/opt/net-snmp//lib/libnetsnmpmibs.so.15
#9 0x006c3f67 in ipAddressTable_release_rowreq_ctx () from
/opt/net-snmp//lib/libnetsnmpmibs.so.15
#10 0x006c8758 in ipAddressTable_container_load () from
/opt/net-snmp//lib/libnetsnmpmibs.so.15
#11 0x006c74fd in ipAddressTable_allocate_rowreq_ctx () from
/opt/net-snmp//lib/libnetsnmpmibs.so.15
#12 0x00198d54 in netsnmp_is_cache_valid () from
/opt/net-snmp//lib/libnetsnmphelpers.so.15
#13 0x003b396f in run_alarms () from /opt/net-snmp//lib/libnetsnmp.so.15
#14 0x0804c003 in SnmpdCatchRandomSignal ()
#15 0x00850e23 in __libc_start_main () from /lib/tls/libc.so.6
#16 0x08049f41 in ?? ()
So it does appear that the same IP address on multiple devices will kill an
unpatched agent.
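To check other machines for the same condition, something like the following could flag interfaces that share an IPv4 address. This is only a sketch: it assumes one-line-per-address output in the style of `ip -o -4 addr show`, and the sample text below is made up for illustration (the interface indexes and netmask are hypothetical; 1.2.3.4 is the sanitized placeholder address from this thread).

```python
from collections import defaultdict


def find_shared_addresses(ip_output):
    """Map each IPv4 address to the interfaces carrying it, keeping only
    addresses that appear on more than one interface."""
    by_addr = defaultdict(list)
    for line in ip_output.splitlines():
        fields = line.split()
        # Assumed line shape: "<idx>: <ifname> inet <addr>/<prefix> ..."
        if len(fields) < 4 or fields[2] != "inet":
            continue
        ifname = fields[1]
        addr = fields[3].split("/")[0]
        if ifname not in by_addr[addr]:
            by_addr[addr].append(ifname)
    return {addr: names for addr, names in by_addr.items() if len(names) > 1}


# Illustrative input only; not captured from the affected host.
SAMPLE = """\
1: lo    inet 127.0.0.1/8 scope host lo
2: eth0    inet 1.2.3.4/24 brd 1.2.3.255 scope global eth0
6: eth4    inet 1.2.3.4/24 brd 1.2.3.255 scope global eth4
8: bond0    inet 1.2.3.4/24 brd 1.2.3.255 scope global bond0
"""

if __name__ == "__main__":
    for addr, names in find_shared_addresses(SAMPLE).items():
        print(f"{addr} is configured on: {', '.join(names)}")
```

An empty result would mean no address is shared, which would point away from the duplicate-address explanation on that host.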
This morning I patched it with two patches:
official patch: 1805971
and
diff.ipaddress-patch-541
With both applied, the agent still dies when run standalone; however, while it
is running it takes the CPU usage up to 92-93% and sits there.
12430 root 25 0 9004 3896 2376 R 92.0 1.6 1:32.74 snmpd
A backtrace shows the following:
0x0089e01b in _int_malloc () from /lib/tls/libc.so.6
(gdb) where
#0 0x0089e01b in _int_malloc () from /lib/tls/libc.so.6
#1 0x0089fc76 in calloc () from /lib/tls/libc.so.6
#2 0x0034d247 in netsnmp_access_ipaddress_ioctl_get_interface_count (sd=10,
ifc=0x1) at ip-mib/data_access/ipaddress_ioctl.c:513
#3 0x00349bde in netsnmp_access_interface_ioctl_has_ipv4 (sd=10,
if_name=0xbfe1a904 "lo", if_index=0, flags=0xbfe1a7fc)
at if-mib/data_access/interface_ioctl.c:443
#4 0x00348106 in netsnmp_arch_interface_container_load (container=0x9160de8,
load_flags=0)
at if-mib/data_access/interface_linux.c:551
#5 0x003204e1 in netsnmp_access_interface_container_load (container=0x9160de8,
load_flags=0)
at if-mib/data_access/interface.c:159
#6 0x003263cf in ifTable_container_load (container=0x90c6308) at
if-mib/ifTable/ifTable_data_access.c:381
#7 0x00325015 in _cache_load (cache=0x90c62a8, vmagic=0x90c6308) at
if-mib/ifTable/ifTable_interface.c:1838
#8 0x005c7d54 in _cache_load (cache=0x90c62a8) at cache_handler.c:537
#9 0x0016b96f in run_alarms () at snmp_alarm.c:252
#10 0x0804c003 in main (argc=11, argv=0xbfe1af64) at snmpd.c:1210
#11 0x00850e23 in __libc_start_main () from /lib/tls/libc.so.6
#12 0x08049f41 in _start ()
(gdb) list
1210 run_alarms();
1211
1212 netsnmp_check_outstanding_agent_requests();
1213
1214 } /* endwhile */
1215
1216 snmp_log(LOG_INFO, "Received TERM or STOP signal... shutting down...\n");
1217 return 0;
1218
1219 } /* end receive() */
It definitely looks like it's related to bugs 1792716 and 1794532; however,
patching the agent does not seem to help matters. I will try the zones patch at
this point.
Thanks,
Jayson
> Date: Mon, 5 Nov 2007 14:55:12 +0000
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: More net-snmp 5.4.1 startup issues.
> CC: [email protected]
>
> On 30/10/2007, Jayson Robinson <[EMAIL PROTECTED]> wrote:
> > I'm having issues within my environment in some areas where net-snmp 5.4.1
> > isn't starting up correctly. Once in a while it will start up correctly but
> > more often than not it dies within a minute of starting it up with a seg
> > fault.
>
>
> > error on subcontainer 'ia_addr' insert (-1)
> > error on subcontainer 'ia_index' insert (-1)
>
> I'm not really familiar with this section of the agent, but
> it has been suggested to us that this problem may arise
> when two or more interfaces share the same IP address.
>
> Checking your file 'ifconfig.txt', it appears that eth0 and eth4 both
> have the IP address 1.2.3.4. (As does bond0)
>
> Could you try changing eth4 to use a different address
> (and temporarily drop the bond0 interfaces altogether).
>
> Does that make any difference?
>
> Dave
_______________________________________________
Net-snmp-coders mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders