I'm having trouble with ntop dying.  I ran it through gdb and came up with
the information listed below.  I ran it once from within gdb and once from
outside, then loaded the resulting core file into gdb.  I'm running it on
RedHat AS 3 and built it myself.  ntop crashes pretty easily, so if there is
something I should try, let me know and I will try it.


Run inside gdb:


Error type: Segmentation fault

(gdb) info thread
......
  9 Thread -1328637008 (LWP 19784)  0xb75ebc32 in _dl_sysinfo_int80 () from
/lib/ld-linux.so.2
  8 Thread -1315648592 (LWP 19783)  0xb75ebc32 in _dl_sysinfo_int80 () from
/lib/ld-linux.so.2
* 7 Thread -1304302672 (LWP 19782)  0xb5becaaa in _int_malloc () from
/lib/tls/libc.so.6
  6 Thread -1292686416 (LWP 19781)  0xb75ebc32 in _dl_sysinfo_int80 () from
/lib/ld-linux.so.2
  5 Thread -1282196560 (LWP 19780)  0xb75ebc32 in _dl_sysinfo_int80 () from
/lib/ld-linux.so.2
....

(gdb) thread 7
[Switching to thread 7 (Thread -1304302672 (LWP 19782))]#2  0xb5f62062 in
ntop_safemalloc (sz=1,
    file=0xb5f8d5c5 "hash.c", line=954) at leaks.c:75
75        thePtr = malloc(sz);

(gdb) list
70                               */
71        }
72      #endif
73
74      #ifndef USE_GC
75        thePtr = malloc(sz);
76      #else
77        thePtr = GC_malloc_atomic(sz);
78      #endif
79

(gdb) print sz
$4 = 1

(gdb) bt
#0  0xb5becaaa in _int_malloc () from /lib/tls/libc.so.6
#1  0xb5bebdfd in malloc () from /lib/tls/libc.so.6
#2  0xb5f62062 in ntop_safemalloc (sz=1, file=0xb5f8d5c5 "hash.c",
line=954) at leaks.c:75
#3  0xb5f5b4dc in _lookupHost (hostIpAddress=0xb241cd10, ether_addr=0x0,
vlanId=0,
    checkForMultihoming=0 '\0', forceUsingIPaddress=1 '\001',
actualDeviceId=1,
    file=0xb250b639 "netflowPlugin.c", line=497) at hash.c:954
#4  0xb25051ad in handleGenericFlow (recordActTime=869082435,
recordSysUpTime=-1783133275,
    record=0xb241dab0, deviceId=1) at netflowPlugin.c:497
#5  0xb2505c62 in dissectFlow (buffer=0xb241e200 "", bufferLen=1464,
deviceId=1) at netflowPlugin.c:1276
#6  0xb2506b7f in netflowMainLoop (_deviceId=0x1) at netflowPlugin.c:1469
#7  0xb5d0cdac in start_thread () from /lib/tls/libpthread.so.0
#8  0xb5c569ea in clone () from /lib/tls/libc.so.6


Run from the command line:

Mon Sep 12 14:21:15 2005  THREADMGMT[t2733636528]: RRD: Throughput data
collection: Thread running [p19952]
Mon Sep 12 14:21:15 2005  THREADMGMT[t2733636528]: RRD: Started thread for
throughput data collection
Mon Sep 12 14:21:15 2005  THREADMGMT[t2762632112]: RRD: Data collection
thread running [p19952]
Segmentation fault (core dumped)
# ls -lt | head
total 306628
-rw-------    1 root     root     565616640 Sep 12 14:30 core.19952
....

# gdb /home/hc05/ntop-mine/bin/ntop core.19952
GNU gdb Red Hat Linux (5.3.90-0.20030710.40rh)

(gdb) bt
#0  0xb5bf4857 in memset () from /lib/tls/libc.so.6
#1  0x104eec48 in ?? ()
#2  0xb5f839b1 in resetUsageCounter (counter=0x49c) at util.c:3690
#3  0xb5f83a6c in resetSecurityHostTraffic (el=0x104eec48) at util.c:3713
#4  0xb5f65802 in allocateSecurityHostPkts (srcHost=0x104eec48) at
pbuf.c:123
#5  0xb2504d5c in handleGenericFlow (recordActTime=685188419,
recordSysUpTime=-1795279914,
    record=0xaea8aab0, deviceId=6) at netflowPlugin.c:612
#6  0xb2505c62 in dissectFlow (buffer=0xaea8b200 "", bufferLen=1464,
deviceId=6) at netflowPlugin.c:1276
#7  0xb2506b7f in netflowMainLoop (_deviceId=0x6) at netflowPlugin.c:1469
#8  0xb5d0cdac in start_thread () from /lib/tls/libpthread.so.0
#9  0xb5c569ea in clone () from /lib/tls/libc.so.6


(gdb) frame 2
#2  0xb5f839b1 in resetUsageCounter (counter=0x49c) at util.c:3690
3690      memset(counter, 0, sizeof(UsageCounter));
(gdb) list
3685    /* ******************************* */
3686
3687    void resetUsageCounter(UsageCounter *counter) {
3688      int i;
3689
3690      memset(counter, 0, sizeof(UsageCounter));
3691
3692      for(i=0; i<MAX_NUM_CONTACTED_PEERS; i++)
3693        setEmptySerial(&counter->peersSerials[i]);
3694    }
(gdb) print counter
$1 = (UsageCounter *) 0x49c

(gdb) frame 3
#3  0xb5f83a6c in resetSecurityHostTraffic (el=0x104eec48) at util.c:3713
3713      resetUsageCounter(&el->secHostPkts->nullPktsSent);
(gdb) print el
$2 = (HostTraffic *) 0x104eec48
(gdb) print el->secHostPkts->nullPktsSent
Cannot access memory at address 0x49c
(gdb) print el->secHostPkts
$3 = (SecurityHostProbes *) 0x0
(gdb)

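The NULL pointer looks like trouble: el->secHostPkts is 0x0, so
&el->secHostPkts->nullPktsSent is just the member's offset, 0x49c, and that
is the bogus address handed to resetUsageCounter() and then to memset().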


(gdb) frame 4
#4  0xb5f65802 in allocateSecurityHostPkts (srcHost=0x104eec48) at
pbuf.c:123
123         resetSecurityHostTraffic(srcHost);
(gdb) list
118     /* ******************************* */
119
120     void allocateSecurityHostPkts(HostTraffic *srcHost) {
121       if(srcHost->secHostPkts == NULL) {
122         if((srcHost->secHostPkts = (SecurityHostProbes*)malloc(sizeof(SecurityHostProbes))) == NULL) return;
123         resetSecurityHostTraffic(srcHost);
124       }
125     }
126
127     /* ************************************ */
(gdb) print srcHost->secHostPkts
$4 = (SecurityHostProbes *) 0x0

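I don't know why the test on line 122 didn't catch the NULL: if malloc() had
returned NULL there, the function should have returned before line 123 ever
called resetSecurityHostTraffic().  I also don't know why the malloc would
have failed, unless it is because the process's memory usage is really big
(the core file is about 565 MB).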

_______________________________________________
Ntop-dev mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-dev