Hi,

and again a weird issue. nfdump 1.6rc2 had been running fine for a couple of weeks (I got the .tgz from Peter on Nov 25th and installed it pretty much immediately). The box was rebooted today (after 1000+ days of uptime), and suddenly nfcapd started to segfault on the first packet it received. nfcapd is started by nfsen 1.3.2 like this:

/usr/local/bin/nfcapd -w -I csr1_2wr -p 9803 -u monitor -g www -B 200000 -S 1 -l /gfiler/nfsen/profiles-data/live/csr1_2wr -P /gfiler/nfsen/var/run/csr1_2wr.pid -z

Log:

Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 2 byte input/output interface index
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 4 byte input/output interface index
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 2 byte src/dst AS number
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 4 byte src/dst AS number
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: dst tos, direction, src/dst mask
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: IPv4 next hop
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: IPv6 next hop
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: IPv4 BGP next IP
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: IPv6 BGP next IP
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: src/dst vlan id
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 4 byte output packets
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 8 byte output packets
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 4 byte output bytes
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 8 byte output bytes
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 4 byte aggregated flows
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: 8 byte aggregated flows
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: in src/out dst mac address
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: in dst/out src mac address
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: MPLS Labels
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: IPv4 router IP addr
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: IPv6 router IP addr
Dec 14 22:42:24 flowbert nfcapd[7747]: Add extension: router ID
Dec 14 22:42:24 flowbert nfcapd[7747]: Bound to IPv4 host/IP: any, Port: 9803
Dec 14 22:42:24 flowbert nfcapd[7747]: Standard setsockopt, SO_RCVBUF is 137216 Requested length is 200000 bytes
Dec 14 22:42:24 flowbert nfcapd[7747]: System set setsockopt, SO_RCVBUF to 262142 bytes
Dec 14 22:42:24 flowbert nfcapd[7747]: Startup.
Dec 14 22:42:24 flowbert nfcapd[7747]: Init v9: Recognised number of v9 tags: 54
Dec 14 22:42:24 flowbert nfcapd[7747]: Process_v9: New exporter domain 0 from: 129.187.0.1
Dec 14 22:42:29 flowbert nfcapd[7747]: Process_v9: New exporter domain 517 from: 129.187.0.1
Dec 14 22:42:29 flowbert nfcapd[7747]: Process_v9: [517] Add template 258
Dec 14 22:42:29 flowbert nfcapd[7747]: Process_v9: [517] Add template 257
Dec 14 22:42:29 flowbert nfcapd[7747]: Process_v9: [517] Add template 256
Dec 14 22:42:29 flowbert kernel: nfcapd[7747]: segfault at 00002aaaab46ade0 rip 0000000000416a04 rsp 00007fffffe9e500 error 4

gdb backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000416a04 in Process_v9_data (exporter=0x553c60, data_flowset=<value optimized out>, fs=0x553030, table=0x553f70) at netflow_v9.c:1272
1272            if ( *((uint32_t *)&out[table->engine_offset]) == 0 ) {
(gdb) bt full
#0  0x0000000000416a04 in Process_v9_data (exporter=0x553c60, data_flowset=<value optimized out>, fs=0x553030, table=0x553f70) at netflow_v9.c:1272
        output_offset = <value optimized out>
        data_record = (common_record_t *) 0x2aaaaaf03034
        start_time = 1260827029808
        end_time = 1260827029808
        packets = 1
        bytes = 81
        sampling_rate = 1
        size_left = 1232
        First = <value optimized out>
        Last = <value optimized out>
        in = (uint8_t *) 0x555e54 "�...@���@�"
        i = 5586016
        string = 0x100 <Address 0x100 out of bounds>
#1  0x0000000000418e46 in Process_v9 (in_buff=0x5556e0, in_buff_cnt=0, fs=0x553030) at netflow_v9.c:1495
        table = <value optimized out>
        exporter = (exporter_domain_t *) 0x553c60
        common_header = (common_header_t *) 0x555e50
        distance = <value optimized out>
        flowset_id = 256
        flowset_length = 2873535956
        size_left = 1236
        pkg_num = 14
#2  0x0000000000404854 in main (argc=<value optimized out>, argv=0x7ffffff444c0) at nfcapd.c:681
        pid = <value optimized out>
        s = "\000\000\000\000\000\000\000\000\001\000�\201\000\000\000\000\000\000\000#\001\000\000\0000\035���*\000"
        len = <value optimized out>
        pidf = <value optimized out>
        bindhost = 0x0
        datadir = 0x7ffffff465b7 "/gfiler/nfsen/profiles-data/live/csr1_2wr"
        pidstr = "7752\n", '\0' <repeats 15 times>, "x86_64\000\000\000\000\000"
        launch_process = 0x0
        userid = 0x7ffffff46596 "monitor"
        groupid = 0x7ffffff430c0 "/gfiler/nfsen/var/run/csr1_2wr.pid"
        checkptr = 0x7ffffff465ae ""
        listenport = 0x7ffffff4658e "9803"
        mcastgroup = 0x0
        extension_tags = <value optimized out>
        Ident = 0x553010 "csr1_2wr"
---Type <return> to continue, or q <return> to quit---
        pidfile = "/gfiler/nfsen/var/run/csr1_2wr.pid", '\0' <repeats 4061 times>
        fstat = {st_dev = 19, st_ino = 14184345, st_nlink = 3, st_mode = 16893, st_uid = 66, st_gid = 8, pad0 = 0, st_rdev = 0, st_size = 4096, st_blksize = 32768, st_blocks = 8, st_atim = {tv_sec = 1245391628, tv_nsec = 781259000}, st_mtim = {tv_sec = 1260827001, tv_nsec = 717212000}, st_ctim = {tv_sec = 1260827001, tv_nsec = 717212000}, __unused = {0, 0, 0}}
        peer = {addr = {ss_family = 0, __ss_align = 0, __ss_padding = '\0' <repeats 111 times>}, addrlen = 0, sockfd = 0, family = 0, port = 0x0, hostname = 0x0, mcast = 0, flush = 0, send_buffer = 0x0, writeto = 0x0, endp = 0x0}
        fs = (FlowSource_t *) 0x0
        act = {__sigaction_handler = {sa_handler = 0x402a40 <IntHandler>, sa_sigaction = 0x402a40 <IntHandler>}, sa_mask = {__val = {0 <repeats 16 times>}}, sa_flags = 0, sa_restorer = 0}
        family = <value optimized out>
        bufflen = 200000
        twin = 300
        t_start = 1260826800
        sock = 7
        synctime = 1
        do_daemonize = 0
        expire = <value optimized out>
        subdir_index = 1
        sampling_rate = 1
        compress = 1
        c = <value optimized out>

I experimented around a bit and found out that nfcapd seems to run fine as soon as I add "-T +14" (or "all", or more options, but 14 has to be included):

Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 2 byte input/output interface index
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 4 byte input/output interface index
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 2 byte src/dst AS number
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 4 byte src/dst AS number
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: dst tos, direction, src/dst mask
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: IPv4 next hop
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: IPv6 next hop
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: IPv4 BGP next IP
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: IPv6 BGP next IP
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: src/dst vlan id
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 4 byte output packets
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 8 byte output packets
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 4 byte output bytes
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 8 byte output bytes
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 4 byte aggregated flows
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: 8 byte aggregated flows
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: in src/out dst mac address
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: in dst/out src mac address
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: MPLS Labels
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: IPv4 router IP addr
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: IPv6 router IP addr
Dec 14 22:46:32 flowbert nfcapd[7828]: Add extension: router ID
Dec 14 22:46:32 flowbert nfcapd[7828]: Bound to IPv4 host/IP: any, Port: 9803
Dec 14 22:46:32 flowbert nfcapd[7828]: Standard setsockopt, SO_RCVBUF is 137216 Requested length is 200000 bytes
Dec 14 22:46:32 flowbert nfcapd[7828]: System set setsockopt, SO_RCVBUF to 262142 bytes
Dec 14 22:46:32 flowbert nfcapd[7828]: Startup.
Dec 14 22:46:32 flowbert nfcapd[7828]: Init v9: Recognised number of v9 tags: 54
Dec 14 22:46:32 flowbert nfcapd[7828]: Process_v9: New exporter domain 517 from: 129.187.0.1
Dec 14 22:46:32 flowbert nfcapd[7828]: Process_v9: [517] Add template 257
Dec 14 22:46:32 flowbert nfcapd[7828]: Process_v9: [517] Add template 256
Dec 14 22:46:33 flowbert nfcapd[7828]: Process_v9: [517] Add template 258
Dec 14 22:46:35 flowbert nfcapd[7828]: Process_v9: New exporter domain 0 from: 129.187.0.1
Dec 14 22:46:45 flowbert nfcapd[7828]: Process_v9: [0] Add template 256
Dec 14 22:46:48 flowbert nfcapd[7828]: Process_v9: [0] Add template 257
[ Ctrl+C ]
Dec 14 22:47:04 flowbert nfcapd[7828]: Ident: 'csr1_2wr' Flows: 74302, Packets: 30953144, Bytes: 31378122253, Sequence Errors: 0, Bad Packets: 0

The boxes are fed by Cisco Catalyst 6500 series switches, mostly running SXI* IOS and exporting dual-stack NetFlow v9.

Any ideas? The system is really old, and there might just have been a botched update during those 1000 days of uptime that only took effect after the reboot, but it feels odd.

Bernhard