(Yeah, this message is in html. I don't usually do it, but for showing an example of an hmtl page, it's the easiest way...)
For tracking down deadlocks, I've added this new page http://127.0.0.1:3000/showMutex.html to the version in the ntop cvs. (previously, the mutex report was only generated if the -K flag was set, which has other consequences).
Note that just because something is 'blocked' for an instant doesn't imply a deadlock. That is COMPLETELY NORMAL behavior.
The normal output looks like this:
| Mutex Name | State | Last Lock | Blocked | Last UnLock | # Locks/Releases | Max Lock | |
|---|---|---|---|---|---|---|---|
| gdbmMutex | unlocked | at Fri 31 Oct 2003 10:53:41 AM CST util.c:3386(13349) | util.c:3393(13349) | 100 | 100 | 0 sec [util.c:3413] | |
| packetProcessMutex | unlocked | at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:1631(13350) | pbuf.c:1646(13350) | 477 | 477 | 1 sec [pbuf.c:1646] | |
| packetQueueMutex | unlocked | at Fri 31 Oct 2003 10:53:37 AM CST pbuf.c:1744(13344) | pbuf.c:1764(13344) | 144 | 144 | 0 sec [pbuf.c:1689] | |
| purgeMutex | locked | at Fri 31 Oct 2003 10:53:41 AM CST http.c:2584(13349) | :0(0) | 1 | 0 | 0 sec [:0] | |
| hostsHashMutex | unlocked | at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:1998(13350) | pbuf.c:2703(13350) | 477 | 477 | 1 sec [pbuf.c:2703] | |
| tcpSessionsMutex | unlocked | at Fri 31 Oct 2003 10:53:41 AM CST sessions.c:584(13350) | sessions.c:1975(13350) | 30 | 30 | 0 sec [sessions.c:1975] | |
| purgePortsMutex | unlocked | at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:597(13350) | pbuf.c:618(13350) | 57 | 57 | 0 sec [pbuf.c:618] | |
| securityItemsMutex | unlocked | at Fri 31 Oct 2003 10:53:41 AM CST http.c:2210(13349) | http.c:2223(13349) | 2 | 2 | 0 sec [http.c:2205] | |
The give-away of a deadlock would be a line like this:
Mutex Name
State
Last Lock
Blocked
Last UnLock
# Locks/Releases
Max Lock
gdbmMutex
unlocked
at Fri 31 Oct 2003 10:53:41 AM CST util.c:3386(13349)
util.c:3393(13349)
100
100
0 sec [util.c:3413]
packetProcessMutex
unlocked
at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:1631(13350)
pbuf.c:1646(13350)
477
477
1 sec [pbuf.c:1646]
packetQueueMutex
unlocked
at Fri 31 Oct 2003 10:53:37 AM CST pbuf.c:1744(13344)
pbuf.c:1764(13344)
144
144
0 sec [pbuf.c:1689]
purgeMutex
locked
at Fri 31 Oct 2003 10:53:41 AM CST http.c:2584(13349)
:0(0)
1
0
0 sec [:0]
hostsHashMutex
locked
at Fri 31 Oct 2003 10:52:17 AM CST hash.c:552(13350)
pbuf.c:1998 (13355)
hash.c:601 (13350)
477
477
1 sec [pbuf.c:2703]
tcpSessionsMutex
unlocked
at Fri 31 Oct 2003 10:53:41 AM CST
sessions.c:584(13350)
sessions.c:1975(13350)
30
30
0 sec [sessions.c:1975]
purgePortsMutex
unlocked
at Fri 31 Oct 2003 10:53:41 AM CST pbuf.c:597(13350)
pbuf.c:618(13350)
57
57
0 sec [pbuf.c:618]
securityItemsMutex
unlocked
at Fri 31 Oct 2003 10:53:41 AM CST http.c:2210(13349)
http.c:2223(13349)
2
2
0 sec
[http.c:2205]
See how the 'at' tag is OLD and there's something in the blocked line? If you refresh a few times over 30s or a minute and that doesn't change, you're deadlocked.
The KEY to reporting these is to go into the source and pull up the actual lines - plus some context - where the Last Lock, Last UnLock and Block show up. (Since the cvs changes often, line #s aren't very useful without the corresponding code).
For example, the report of the above should be:
Deadlock (web server active), showMutex.html shows hostsHashMutex, locked at hash.c:552:
for(idx=0;
idx<myGlobals.device[actDevice].actualHashSize; idx++) {
#ifdef
CFG_MULTITHREADED
accessMutex(&myGlobals.hostsHashMutex, "scanIdleLoop");
#endif
Last Unlock at hash.c:601:
if(numHosts >= (maxHosts-1)) break;
Blocked at pbuf.c:1998:
incrementTrafficCounter(&myGlobals.device[actualDeviceId].rcvdPktStats.tooLong,
1);
}
#ifdef CFG_MULTITHREADED
accessMutex(&myGlobals.hostsHashMutex,
"processPacket");
#endif
#ifdef DEBUG
traceEvent(CONST_TRACE_INFO, "actualDeviceId = %d",
actualDeviceId);
#endif
-----Burton
