The attached patch is for OpenAFS 1.2.10, but it applies to my not-too-recent 1.3.something as well:
in rx_packet.c:rxi_ReceiveDebugPacket() the rx_idleServerQueue is scanned without taking a lock first. The queue_Remove macro zeroes the ->next field (luckily), but this can still lead to a crash when Murphy strikes and an entry gets removed during the scan: queue_Scan will then dereference 0x0->next.

Actually, from experience I would argue that the number of "idle threads" is pretty useless information anyway (as opposed to, e.g., the number of calls waiting!), so why count them at all? In that case the "fix" would be to stuff 42 or whatever in there...

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Rainer Toebbicke
European Laboratory for Particle Physics (CERN) - Geneva, Switzerland
Phone: +41 22 767 8985 Fax: +41 22 767 7155

*** openafs/src/rx/rx_packet.c.orig	Fri May 23 08:52:31 2003
--- openafs/src/rx/rx_packet.c	Wed May 26 17:49:03 2004
***************
*** 1142,1147 ****
--- 1142,1148 ----
  #ifndef RX_ENABLE_LOCKS
  	    tstat.waitingForPackets = rx_waitingForPackets;
  #endif
+ 	    MUTEX_ENTER(&rx_serverPool_lock);
  	    tstat.nFreePackets = htonl(rx_nFreePackets);
  	    tstat.callsExecuted = htonl(rxi_nCalls);
  	    tstat.packetReclaims = htonl(rx_packetReclaims);
***************
*** 1149,1154 ****
--- 1150,1156 ----
  	    tstat.nWaiting = htonl(rx_nWaiting);
  	    queue_Count( &rx_idleServerQueue, np, nqe,
  				rx_serverQueueEntry, tstat.idleThreads);
+ 	    MUTEX_EXIT(&rx_serverPool_lock);
  	    tstat.idleThreads = htonl(tstat.idleThreads);
  	    tl = sizeof(struct rx_debugStats) - ap->length;
  	    if (tl > 0)
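
For anyone who wants to see the locking pattern outside of rx, here is a minimal stand-alone sketch. It is not OpenAFS code: the demo_* names are hypothetical, the queue type is a simplified stand-in for the rx queue macros, and a plain pthread mutex is assumed in place of MUTEX_ENTER/MUTEX_EXIT. It only shows the idea of the patch, namely that the scan and the removal take the same lock, so the scan can never step onto an entry whose next pointer was just zeroed.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an rx queue entry; not the OpenAFS definition. */
struct demo_queue {
    struct demo_queue *prev;
    struct demo_queue *next;
};

/* Assumed lock protecting the idle queue (plays the role of rx_serverPool_lock). */
pthread_mutex_t demo_pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Circular queue head; an empty queue points at itself. */
struct demo_queue demo_idle_queue = { &demo_idle_queue, &demo_idle_queue };

/* Append an entry at the tail. Caller must hold demo_pool_lock. */
void demo_append(struct demo_queue *head, struct demo_queue *e)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

/* Unlink an entry and zero its next pointer, mimicking the behaviour
 * described for queue_Remove. Caller must hold demo_pool_lock. */
void demo_remove(struct demo_queue *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = NULL;
}

/* Count the entries while holding the lock, so that no entry can be
 * unlinked (and have its next pointer zeroed) in the middle of the scan. */
int demo_count_idle(void)
{
    int n = 0;
    struct demo_queue *q;

    pthread_mutex_lock(&demo_pool_lock);
    for (q = demo_idle_queue.next; q != &demo_idle_queue; q = q->next)
        n++;
    pthread_mutex_unlock(&demo_pool_lock);
    return n;
}
```

Without the lock in demo_count_idle(), a concurrent demo_remove() on the entry the loop currently points at would leave q->next == NULL, and the next iteration would dereference it, which is the crash described above.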
